Preface



This volume contains the proceedings of the LATIN 2000 International Conference 
(Latin American Theoretical INformatics), to be held in Punta del Este, Uruguay, April 
10-14, 2000. 

This is the fourth event in the series following Sao Paulo, Brazil (1992), Valparaiso, 
Chile (1995), and Campinas, Brazil (1998). LATIN has established itself as a fully 
refereed conference for theoretical computer science research in Latin America. It has 
also strengthened the ties between local and international scientific communities. We 
believe that this volume reflects the breadth and depth of this interaction. 

We received 87 submissions, from 178 different authors in 26 different countries. 
Each paper was assigned to three program committee members. The Program Committee 
selected 42 papers based on approximately 260 referee reports. In addition to these 
contributed presentations, the conference included six invited talks. 

The assistance of many organizations and individuals was essential for the success 
of this meeting. We would like to thank all of our sponsors and supporting organizations. 
Ricardo Baeza-Yates, Claudio Lucchesi, Arnaldo Moura, and Imre Simon provided
insightful advice and shared with us their experiences as organizers of previous LATIN
meetings. Joaquin Goyoaga and Patricia Corbo helped in the earliest stages of the
organization in various ways, including finding Uruguayan sources of financial support.
SeCIU (Servicio Central de Informatica Universitario, Universidad de la Republica)
provided us with the necessary communication infrastructure. The meeting of the program
committee was hosted by the Instituto de Matematica e Estatistica, Universidade de Sao
Paulo, which also provided us with the Intranet site for discussions among PC members.
We thank the researchers of the Institute for their collaboration, and in particular, Arnaldo
Mandel for the Intranet setup. Finally, we thank Springer-Verlag for their commitment
in publishing this and previous LATIN proceedings in the Lecture Notes in Computer
Science series.

We are encouraged by the positive reception and interest that LATIN 2000 has created 
in the community, partly indicated by a record number of submissions. 



January 2000 



Gaston Gonnet 
Daniel Panario 
Alfredo Viola 




The Conference 



Invited Speakers

Allan Borodin (Canada)
Philippe Flajolet (France)
Joachim von zur Gathen (Germany)
Yoshiharu Kohayakawa (Brazil)
Andrew Odlyzko (USA)
Prabhakar Raghavan (USA)

Program Committee

Ricardo Baeza-Yates (Chile)
Bela Bollobas (USA)
Felipe Cucker (Hong Kong)
Josep Diaz (Spain)
Esteban Feuerstein (Argentina)
Celina M. de Figueiredo (Brazil)
Gaston Gonnet (Switzerland, Chair)
Jozef Gruska (Czech Republic)
Joos Heintz (Argentina/Spain)
Gerard Huet (France)
Marcos Kiwi (Chile)
Ming Li (Canada)
Claudio L. Lucchesi (Brazil)
Ron Mullin (Canada)
Ian Munro (Canada)
Daniel Panario (Canada)
Dominique Perrin (France)
Patricio Poblete (Chile)
Bruce Reed (France)
Bruce Richmond (Canada)
Vojtech Rodl (USA)
Imre Simon (Brazil)
Neil Sloane (USA)
Endre Szemeredi (USA)
Alfredo Viola (Uruguay)
Yoshiko Wakabayashi (Brazil)
Siang Wun Song (Brazil)
Nivio Ziviani (Brazil)

Organizing Committee

Ed Coffman Jr.
Cristina Cornes
Javier Molina
Laura Molina
Lucia Moura
Daniel Panario (Co-Chair)
Alberto Pardo
Luis Sierra
Alfredo Viola (Co-Chair)


Local Arrangements 

The local arrangements for the conference were handled by IDEAS S.R.L. 

Organizing Institutions 

Instituto de Computacion (Universidad de la Republica Oriental del Uruguay) 
Pedeciba Informatica 







Sponsors and Supporting Organizations 

CLEI (Centro Latinoamericano de Estudios en Informatica) 

CSIC (Comision Sectorial de Investigacion Cientifica, Universidad de la Republica) 
CONICYT (Consejo Nacional de Investigaciones Cientificas y Tecnicas) 

UNESCO 

Universidad ORT del Uruguay 
Tecnologia Informatica 







Referees 

Carme Alvarez 
Andre Arnold 
Juan Carlos Augusto 
Valmir C. Barbosa 
Alejandro Bassi 
Gabriel Baum 
Denis Bechet 
Leopoldo Bertossi 
Ralf Borndoerfer
Claudson Bornstein
Richard Brent 
Veronique Bruyere
Hector Cancela 
Rodney Canfield
Jianer Chen 
Christian Choffrut
Jose Coelho de Pina Jr. 
Ed Coffman Jr. 

Don Coppersmith 
Cristina Cornes
Bruno Courcelle 
Gustavo Crispino 
Maxime Crochemore 
Diana Cukierman 
Joe Culberson 
Ricardo Dahab 
Celia Picinin de Mello 
Erik Demaine 
Nachum Dershowitz 
Luc Devroye 
Volker Diekert 
Luis Dissett 
Juan V. Echagüe
David Eppstein 
Martín Farach-Colton
Paulo Feofiloff
Henning Fernau
Cristina G. Fernandes 
W. Fernandez de la Vega 
Carlos E. Ferreira 
Marcelo Frías
Zoltan Furedi 



Joaquim Gabarro 
Juan Garay 
Mark Giesbrecht 
Eduardo Gimenez 
Bernard Gittenberger 
Raul Gouet 
Qian-Ping Gu 
Marisa Gutierrez 
Ryan Hayward 
Chính T. Hoàng
Delia Kesner 
Ayman Khalfalah 
Yoshiharu Kohayakawa 
Teresa Krick 
Eyal Kushilevitz 
Antonín Kucera
Imre Leader 
Hanno Lefmann 
Sebastian Leipert 
Stefano Leonardi 
Sachin Lodha 
F. Javier Lopez 
Hosam M. Mahmoud 
A. Marchetti-Spaccamela 
Claude Marche 
Arnaldo Mandel
Martín Matamala
Guillermo Matera 
Jacques Mazoyer 
Alberto Mendelzon 
Ugo Montanari 
François Morain
Petra Mutzel 
Rajagopal Nagarajan 
Gonzalo Navarro 
Marden Neubert 
Cyril Nicaud 
Takao Nishizeki 
Johan Nordlander 
Alfredo Olivero 
Alberto Pardo 
Jordi Petit 



Wojciech Plandowski 
Libor Polak 
Pavel Pudlak 
Davood Rafiei
Ivan Rapaport 
Mauricio G.C. Resende 
Celso C. Ribeiro 
Alexander Rosa 
Salvador Roura 
Andrzej Rucinski 
Juan Sabia 
Philippe Schnoebelen
Maria Serna 
Oriol Serra 
Jiri Sgall 

Guillermo R. Simari 
Jose Soares 
Pablo Solerno
Doug Stinson 
Jorge Stolfi
Leen Stougie
Jayme Szwarcfiter
Prasad Tetali 
Dimitrios Thilikos 
Soledad Torres 
Luca Trevisan 
Vilmar Trevisan 
Andrew Turpin 
Kristina Vuskovic 
Lusheng Wang 
Sue Whitesides 
Thomas Wilke 
Hugh Williams 
David Williamson 
Fatos Xhafa 
Daniel Yankelevich 
Sheng Yu 
Louxin Zhang 
Binhai Zhu 




Table of Contents 



Random Structures and Algorithms 

Algorithmic Aspects of Regularity (Invited Paper) 1 

Y. Kohayakawa, V. Rodl 

Small Maximal Matchings in Random Graphs 18 

Michele Zito 

Some Remarks on Sparsely Connected Isomorphism-Free Labeled Graphs 28 

Vlady Ravelomanana, Loys Thimonier

Analysis of Edge Deletion Processes on Faulty Random Regular Graphs 38 

Andreas Goerdt, Mike Molloy 

Equivalent Conditions for Regularity 48 

Y. Kohayakawa, V. Rodl, J. Skokan 

Algorithms I 

Cube Packing 58 

F.K. Miyazawa, Y. Wakabayashi

Approximation Algorithms for Flexible Job Shop Problems 68 

Klaus Jansen, Monaldo Mastrolilli, Roberto Solis-Oba 

Emerging Behavior as Binary Search Trees Are Symmetrically Updated 78 

Stephen Taylor 

The LCA Problem Revisited 88 

Michael A. Bender, Martin Farach-Colton 

Combinatorial Designs 

Optimal and Pessimal Orderings of Steiner Triple Systems in Disk Arrays 95 

Myra B. Cohen, Charles J. Colbourn 

Rank Inequalities for Packing Designs and Sparse Triple Systems 105 

Lucia Moura 

The Anti-Oberwolfach Solution: Pancyclic 2-Factorizations of Complete Graphs 115
Brett Stevens 







Web Graph, Graph Theory I 

Graph Structure of the Web: A Survey (Invited Paper) 123 

Prabhakar Raghavan 

Polynomial Time Recognition of Clique-Width ≤ 3 Graphs 126

Derek G. Corneil, Michel Habib, Jean-Marc Lanlignel, Bruce Reed, Udi Rotics

On Dart-Free Perfectly Contractile Graphs 135 

Claudia Linhares Sales, Frederic Maffray 

Graph Theory II 

Edge Colouring Reduced Indifference Graphs 145 

Celina M.H. de Figueiredo, Celia Picinin de Mello, Carmen Ortiz 

Two Conjectures on the Chromatic Polynomial 154 

David Avis, Caterina De Simone, Paolo Nobili 

Finding Skew Partitions Efficiently 163 

Celina M.H. de Figueiredo, Sulamita Klein, Yoshiharu Kohayakawa, 

Bruce A. Reed 

Competitive Analysis, Complexity 

On the Competitive Theory and Practice of Portfolio Selection (Invited Paper) 173
Allan Borodin, Ran El-Yaniv, Vincent Gogan 

Almost k-Wise Independence and Hard Boolean Functions 197

Valentine Kabanets 

Improved Upper Bounds on the Simultaneous Messages Complexity of the Generalized Addressing Function 207

Andris Ambainis, Satyanarayana V. Lokam 

Algorithms II 

Multi-parameter Minimum Spanning Trees 217 

David Fernández-Baca

Linear Time Recognition of Optimal L-Restricted Prefix Codes 227 

Ruy Luiz Milidiu, Eduardo Sany Laber 

Uniform Multi-hop All-to-All Optical Routings in Rings 237 

Jaroslav Opatrny 

A Fully Dynamic Algorithm for Distributed Shortest Paths 247 

Serafino Cicerone, Gabriele Di Stefano, Daniele Frigioni, Umberto Nanni







Computational Number Theory, Cryptography 

Integer Factorization and Discrete Logarithms (Invited Paper) 258 

Andrew Odlyzko 

Communication Complexity and Fourier Coefficients of the Diffie-Hellman Key 259
Igor E. Shparlinski 

Quintic Reciprocity and Primality Test for Numbers of the Form $M = A5^n \pm \omega_n$ 269

Pedro Berrizbeitia, Mauricio Odreman Vera, Juan Tena Ayuso 

Determining the Optimal Contrast for Secret Sharing Schemes in Visual Cryptography 280

Matthias Krause, Hans Ulrich Simon 

Analysis of Algorithms I 



Average-Case Analysis of Rectangle Packings 292 

E.G. Coffman, Jr, George S. Lueker, Joel Spencer, Peter M. Winkler 

Heights in Generalized Tries and PATRICIA Tries 298 

Charles Knessl, Wojciech Szpankowski 

On the Complexity of Routing Permutations on Trees by Arc-Disjoint Paths 308 

D. Barth, S. Corteel, A. Denise, D. Gardy, M. Valencia-Pabon 



Algebraic Algorithms 



Subresultants Revisited (Invited Paper) 318 

Joachim von zur Gathen, Thomas Lucking 

A Unifying Framework for the Analysis of a Class of Euclidean Algorithms 343 

Brigitte Vallee 

Worst-Case Complexity of the Optimal LLL Algorithm 355 



Ali Akhavi 

Computability 



Iteration Algebras Are Not Finitely Axiomatizable 367 

Stephen L. Bloom, Zoltán Ésik

Undecidable Problems in Unreliable Computations 377 

Richard Mayr 



Automata, Formal Languages 

Equations in Free Semigroups with Anti-involution and Their Relation to Equations in Free Groups 387

Claudio Gutierrez 







Squaring Transducers: An Efficient Procedure for Deciding Functionality and Sequentiality of Transducers 397

Marie-Pierre Beal, Olivier Carton, Christopher Prieur, Jacques Sakarovitch 

Unambiguous Büchi Automata 407

Olivier Carton, Max Michel 

Linear Time Language Recognition on Cellular Automata with Restricted Communication 417

Thomas Worsch 

Logic, Programming Theory 

From Semantics to Spatial Distribution 427 

Luis R. Sierra Abbate, Pedro R. D'Argenio, Juan V. Echagüe

On the Expressivity and Complexity of Quantitative Branching-Time Temporal Logics 437

F. Laroussinie, Ph. Schnoebelen, M. Turuani 

A Theory of Operational Equivalence for Interaction Nets 447 

Maribel Fernandez, Ian Mackie 

Analysis of Algorithms II 

Run Statistics for Geometrically Distributed Random Variables 457 

Peter Grabner, Arnold Knopfmacher, Helmut Prodinger 

Generalized Covariances of Multi-dimensional Brownian Excursion Local Times 463
Guy Louchard 

Combinatorics of Geometrically Distributed Random Variables: Length of Ascending Runs 473

Helmut Prodinger 

Author Index 483 




Algorithmic Aspects of Regularity 



Y. Kohayakawa^{1,*} and V. Rodl^2

^1 Instituto de Matematica e Estatistica, Universidade de Sao Paulo,
Rua do Matao 1010, 05508-900 Sao Paulo, Brazil
yoshi@ime.usp.br

^2 Department of Mathematics and Computer Science,
Emory University, Atlanta, GA, 30322, USA
rodl@mathcs.emory.edu



Abstract. Szemeredi’s celebrated regularity lemma proved to be a fun- 
damental result in graph theory. Roughly speaking, his lemma states 
that any graph may be approximated by a union of a bounded number 
of bipartite graphs, each of which is ‘pseudorandom’. As later proved 
by Alon, Duke, Lefmann, Rodl, and Yuster, there is a fast deterministic 
algorithm for finding such an approximation, and therefore many of the 
existential results based on the regularity lemma could be turned into 
constructive results. In this survey, we discuss some recent developments 
concerning the algorithmic aspects of the regularity lemma. 



1 Introduction 

In the course of proving his well known density theorem for arithmetic pro- 
gressions [47], Szemeredi discovered a fundamental result in graph theory. This 
result became known as his regularity lemma [48]. For an excellent survey on this
lemma, see Komlos and Simonovits [35]. Roughly speaking, Szemeredi’s lemma 
states that any graph may be approximated by a union of a bounded number of 
bipartite graphs, each of which is ‘pseudorandom’. 

Szemeredi’s proof did not provide an efficient algorithm for finding such an 
approximation, but it was later proved by Alon, Duke, Lefmann, Rodl, and
Yuster [1,2] that such an algorithm does exist. Given the wide applicability of 
the regularity lemma, the result of [1,2] had many consequences. The reader is 
referred to [1,2,14,35] for the first applications of the algorithmic version of the 
regularity lemma. For more recent applications, see [5,6,7,12,17,32,52]. 

If the input graph $G$ has $n$ vertices, the algorithm of [1,2] runs in time
$O(M(n))$, where $M(n) = O(n^{2.376})$ is the time needed to multiply two $n$ by $n$
matrices with $\{0,1\}$-entries over the integers. In [34], an improvement of this is
given: it is shown that there is an algorithm for the regularity lemma that runs
in time $O(n^2)$ for graphs of order $n$.

If one allows randomization, one may do a great deal better, as demonstrated
by Frieze and Kannan. In fact, they show in [21,22] that there is a randomized
algorithm for the regularity lemma that runs in time $O(n)$ for $n$-vertex graphs.
Quite surprisingly, they in fact show that one may obtain an implicit description
of the required output in constant time. The key technique here is sampling.

* Partially supported by FAPESP (Proc. 96/04505-2), by CNPq (Proc. 300334/93-1),
and by MCT/FINEP (PRONEX project 107/97).

G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 1-17, 2000.
(c) Springer-Verlag Berlin Heidelberg 2000

The regularity lemma has been generalized for hypergraphs in a few different
ways; see, e.g., [8,18,19,21,22,42]. One of these generalizations admits
constructive versions, both deterministic [13] and randomized [21,22]. Again, the
consequences of the existence of such algorithms are important. For instance,
Frieze and Kannan [21,22] prove that all ‘dense’ MAXSNP-hard problems admit
PTAS, by making use of such algorithms. For other applications of an
algorithmic hypergraph regularity lemma, see Czygrinow [11].

Let us discuss the organization of this survey. In Section 2, we state the 
regularity lemma for graphs and hypergraphs. In Section 3, we discuss a few 
independent lemmas, each of which allows one to produce algorithms for the 
regularity lemma. In Section 4.1, we state the main result of [34]. Some results 
of Frieze and Kannan are discussed in Section 4.2. In Section 5, we discuss a 
recent result on property testing on graphs, due to Alon, Fischer, Krivelevich, 
and Szegedy [3,4]. The main result in [3,4] is based on a new variant of the 
regularity lemma. We close with some final remarks. 



2 The Regularity Lemma 

In this section we state the regularity lemma and briefly discuss the original 
proof of Szemeredi. In order to be concise, we shall state a hypergraph version 
that is a straightforward extension of the classical lemma. We remark that this 
extension was first considered and applied by Promel and Steger [42]. 



2.1 The Statement of the Lemma 



Given a set $V$ and a non-negative integer $r$, we write $[V]^r$ for the collection of
subsets of $V$ of cardinality $r$. An r-uniform hypergraph or r-graph on the vertex
set $V$ is simply a collection of r-sets $\mathcal{H} \subseteq [V]^r$. The elements of $\mathcal{H}$ are the
hyperedges of $\mathcal{H}$.

Let $U_1, \dots, U_r \subseteq V$ be pairwise disjoint, non-empty subsets of vertices. The
density $d_{\mathcal{H}}(U_1, \dots, U_r)$ of this r-tuple with respect to $\mathcal{H}$ is

$$d_{\mathcal{H}}(U_1, \dots, U_r) = \frac{e(U_1, \dots, U_r)}{|U_1| \cdots |U_r|}, \tag{1}$$

where $e(U_1, \dots, U_r)$ is the number of hyperedges $e \in \mathcal{H}$ with $|e \cap U_i| = 1$ for
all $1 \le i \le r$. We say that the r-tuple $(U_1, \dots, U_r)$ is $\varepsilon$-regular with respect to $\mathcal{H}$
if, for all choices of subsets $U_i' \subseteq U_i$ with $|U_i'| \ge \varepsilon|U_i|$ for all $1 \le i \le r$, we have

$$|d_{\mathcal{H}}(U_1', \dots, U_r') - d_{\mathcal{H}}(U_1, \dots, U_r)| \le \varepsilon. \tag{2}$$

If for any such $U_i'$ ($1 \le i \le r$) we have

$$|d_{\mathcal{H}}(U_1', \dots, U_r') - \alpha| \le \delta, \tag{3}$$

we say that the r-tuple $(U_1, \dots, U_r)$ is $(\alpha, \delta)$-regular. Finally, we say that a
partition $V = V_0 \cup \dots \cup V_k$ of the vertex set $V$ of $\mathcal{H}$ is $\varepsilon$-regular with respect
to $\mathcal{H}$ if

(i) $|V_0| \le \varepsilon|V|$,

(ii) $|V_1| = \dots = |V_k|$,

(iii) at least $(1 - \varepsilon)\binom{k}{r}$ of the r-tuples $(V_{i_1}, \dots, V_{i_r})$ with $1 \le i_1 < \dots < i_r \le k$
are $\varepsilon$-regular with respect to $\mathcal{H}$.

Often, $V_0$ is called the exceptional class of the partition. For convenience, we say
that a partition is $(\varepsilon, k)$-equitable if it satisfies (i) and (ii) above. The
hypergraph version of Szemeredi’s lemma reads as follows.

Theorem 1 For any integers $r \ge 2$ and $k_0 \ge 1$ and real number $\varepsilon > 0$, there
are integers $K = K(r, k_0, \varepsilon)$ and $N = N(r, k_0, \varepsilon)$ such that any r-graph $\mathcal{H}$ on a
vertex set of cardinality at least $N$ admits an $(\varepsilon, k)$-equitable $\varepsilon$-regular partition
with $k_0 \le k \le K$.

Szemeredi [48] considered the case r = 2, that is, the case of graphs. However, 
the proof in [48] generalizes in a straightforward manner to a proof of Theorem 1 
above; see Promel and Steger [42], where the authors prove and apply this result. 
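
To make the definition of regularity concrete for graphs ($r = 2$), the following is a minimal sketch that computes the density of a bipartite pair and searches, by brute force, for a sub-pair violating condition (2). The function names and the edge-set representation are illustrative choices, not taken from the survey, and the search is exponential in the number of vertices, so it is only usable on toy inputs.

```python
from itertools import combinations
from math import ceil

def density(edges, U, W):
    """Edge density d(U, W) = e(U, W) / (|U| |W|) of a bipartite pair."""
    count = sum(1 for u in U for w in W if (u, w) in edges)
    return count / (len(U) * len(W))

def subsets_of_size_at_least(S, m):
    """Yield all non-empty subsets of S with at least m elements."""
    S = list(S)
    for size in range(max(m, 1), len(S) + 1):
        yield from combinations(S, size)

def find_witness(edges, U, W, eps):
    """Return a pair (U', W') violating condition (2), or None if (U, W) is eps-regular.

    Brute force over all large sub-pairs: exponential in |U| + |W|.
    """
    d = density(edges, U, W)
    min_u = ceil(eps * len(U))
    min_w = ceil(eps * len(W))
    for Up in subsets_of_size_at_least(U, min_u):
        for Wp in subsets_of_size_at_least(W, min_w):
            if abs(density(edges, Up, Wp) - d) > eps:
                return Up, Wp
    return None

U, W = [0, 1, 2], ["a", "b", "c"]
# Complete bipartite graph: every sub-pair has density 1, so no witness exists.
complete = {(u, w) for u in U for w in W}
# Perfect matching: overall density 1/3, but a single matched edge has density 1.
matching = {(0, "a"), (1, "b"), (2, "c")}

print(find_witness(complete, U, W, 0.3))
print(find_witness(matching, U, W, 0.3))
```

The quantification over all large sub-pairs is what makes a direct check infeasible; the conditions discussed in Section 3 replace it with tests that involve only polynomially many quantities.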



2.2 Brief Outline of Proofs 

Let $\mathcal{H}$ be an r-graph on the vertex set $V$ with $|V| = n$, and let $\Pi = (V_i)_{i=0}^{k}$ be
an $(\varepsilon, k)$-equitable partition of $V$. A crucial definition in Szemeredi’s proof of his
lemma is the concept of the index $\mathrm{ind}(\Pi)$ of $\Pi$, given by

$$\mathrm{ind}(\Pi) = \binom{k}{r}^{-1} \sum d_{\mathcal{H}}(V_{i_1}, \dots, V_{i_r})^2, \tag{4}$$

where the sum is taken over all r-tuples $1 \le i_1 < \dots < i_r \le k$. Clearly, we always
have $0 \le \mathrm{ind}(\Pi) \le 1$. For convenience, if $\Pi' = (W_j)_{j=0}^{\ell}$ is an $(\varepsilon', \ell)$-equitable
partition of $V$, we say that $\Pi'$ is a refinement of $\Pi$ if, for any $1 \le j \le \ell$, there
is $1 \le i \le k$ for which we have $W_j \subseteq V_i$. In words, $\Pi'$ refines $\Pi$ if any
non-exceptional class of $\Pi'$ is contained in some non-exceptional class of $\Pi$. Now,
the key lemma in the proof of Theorem 1 is the following (see [42]).

Lemma 2 For any integer $r \ge 2$ and real number $\varepsilon > 0$, there exist integers
$k_0 = k_0(r, \varepsilon)$ and $n_0 = n_0(r, \varepsilon)$ and a positive number $\vartheta = \vartheta(r, \varepsilon) > 0$ for
which the following holds. Suppose we have an r-graph $\mathcal{H}$ on a vertex set $V$, with
$n = |V| \ge n_0$, and $\Pi = (V_i)_{i=0}^{k}$ is an $(\varepsilon, k)$-equitable partition of $V$. Then

(i) either $\Pi$ is $\varepsilon$-regular with respect to $\mathcal{H}$,

(ii) or, else, there is a refinement $\Pi' = (W_j)_{j=0}^{\ell}$ of $\Pi$ such that

(a) $|W_0| \le |V_0| + n/4^k$,

(b) $|W_1| = \dots = |W_\ell|$,

(c) $\ell = \ldots$,

(d) $\mathrm{ind}(\Pi') \ge \mathrm{ind}(\Pi) + \vartheta$.

Theorem 1 follows easily from Lemma 2: it suffices to recall that the index
can never be larger than 1, and hence if we successively apply Lemma 2, starting
from an arbitrary partition $\Pi$, we must arrive at an $\varepsilon$-regular partition after a
bounded number of steps, because (d) of alternative (ii) guarantees that the
indices of the partitions that we generate always increase by a fixed positive
amount.
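
The index argument can be illustrated numerically for graphs ($r = 2$). The sketch below assumes that the index is the average of the squared densities over all pairs of non-exceptional classes, which is the normalization that keeps it between 0 and 1; the adjacency-dictionary representation and the function names are illustrative only. On a toy graph it shows the index increasing when a partition is refined, which is the quantity that alternative (ii) of Lemma 2 pushes up at each step.

```python
from itertools import combinations

def pair_density(adj, A, B):
    """Density of the bipartite graph induced between disjoint vertex sets A and B."""
    edges = sum(1 for a in A for b in B if b in adj[a])
    return edges / (len(A) * len(B))

def index(adj, classes):
    """Index of a partition for r = 2: mean squared density over pairs of classes.

    `classes` lists the non-exceptional classes only (assumed normalization).
    """
    pairs = list(combinations(classes, 2))
    return sum(pair_density(adj, A, B) ** 2 for A, B in pairs) / len(pairs)

# Toy graph on vertices 0..3 with edges {0,2} and {1,3}.
adj = {0: {2}, 1: {3}, 2: {0}, 3: {1}}

coarse = [[0, 1], [2, 3]]      # one pair of classes, density 1/2
fine = [[0], [1], [2], [3]]    # refinement into singletons

print(index(adj, coarse))  # 0.25
print(index(adj, fine))    # 2 of the 6 pairs have density 1, so 1/3
```

Since the index is at most 1 and each application of alternative (ii) raises it by at least $\vartheta$, at most $1/\vartheta$ refinement steps can occur before a regular partition is reached.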

Ideally, to turn this proof into an efficient algorithm, given a partition $\Pi$, we
would like to have (I) an efficient procedure to check whether (i) applies, and
(II) if (i) fails, an efficient procedure for finding $\Pi'$ as specified in (a)-(d).

It turns out that if we actually have, at hand, witnesses for the failure of
$\varepsilon$-regularity of $> \varepsilon\binom{k}{r}$ of the r-tuples $(V_{i_1}, \dots, V_{i_r})$, where $1 \le i_1 < \dots < i_r \le k$,
then $\Pi'$ may be found easily (see [42] for details). Here, by a witness for the
$\varepsilon$-irregularity of an r-tuple $(U_1, \dots, U_r)$ we mean an r-tuple $(U_1', \dots, U_r')$ with $U_i' \subseteq U_i$
and $|U_i'| \ge \varepsilon|U_i|$ for all $1 \le i \le r$ for which (2) fails. We are therefore led to
the following decision problem:

Problem 3 Given an r-graph $\mathcal{H}$, an r-tuple $(U_1, \dots, U_r)$ of non-empty, pairwise
disjoint sets of vertices of $\mathcal{H}$, and a real number $\varepsilon > 0$, decide whether this r-tuple
is $\varepsilon$-regular with respect to $\mathcal{H}$.

In case the answer to Problem 3 is negative for a given instance, we would
like to have a witness for the $\varepsilon$-irregularity of the given r-tuple.

3 Conditions for Regularity 

3.1 A Hardness Result 

It turns out that Problem 3 is hard, as proved by Alon, Duke, Lefmann, Rodl, 
and Yuster [1,2]. 

Theorem 4 Problem 3 is coNP-complete for r = 2. 

Let us remark in passing that Theorem 4 is proved in [1,2] for the case in
which $\varepsilon = 1/2$; for a proof for arbitrary $0 < \varepsilon < 1/2$, see Taraz [49].

Theorem 4 is certainly discouraging. Fortunately, however, there is a way 
around. We discuss the graph and hypergraph cases separately. 



The graph case. In the case $r = 2$, that is, in the case of graphs, one has
the following lemma. Below $\mathbb{R}_+$ denotes the set of positive reals. Moreover, a
bipartite graph $B = (U, W; E)$ with vertex classes $U$ and $W$ and edge set $E$
is said to be $\varepsilon$-regular if $(U, W)$ is an $\varepsilon$-regular pair with respect to $B$. Thus,
a witness to the $\varepsilon$-irregularity of $B$ is a pair $(U', W')$ with $U' \subseteq U$, $W' \subseteq W$,
$|U'|, |W'| \ge \varepsilon n$, and $|d_B(U', W') - d_B(U, W)| > \varepsilon$ (see the paragraph before
Problem 3).

Lemma 5 There is a polynomial-time algorithm A and a function $\varepsilon_A : \mathbb{R}_+ \to \mathbb{R}_+$
such that the following holds. When A receives as input a bipartite graph $B =
(U, W; E)$ with $|U| = |W| = n$ and a real number $\varepsilon > 0$, it either correctly asserts
that $B$ is $\varepsilon$-regular, or else it returns a witness for the $\varepsilon_A(\varepsilon)$-irregularity of $B$.

We remark that Lemma 5 implicitly says that $\varepsilon' = \varepsilon_A(\varepsilon) \le \varepsilon$, for otherwise A
would not be able to handle an input graph $B$ that is not $\varepsilon$-regular but is
$\varepsilon'$-regular. In fact, one usually has $\varepsilon' \ll \varepsilon$.

Note that Lemma 5 leaves open what the behaviour of A should be when $B$
is $\varepsilon$-regular but is not $\varepsilon'$-regular. Despite this fact, Lemma 5 does indeed imply
the existence of a polynomial-time algorithm for finding $\varepsilon$-regular partitions of
graphs. We leave the proof of this assertion as an exercise for the reader.

In Sections 3.2, 3.3, and 3.4 we state some independent results that imply 
Lemma 5, thus completing the general description of a few distinct ways one 
may prove the algorithmic version of the regularity lemma for graphs. 



The hypergraph case. In the case of r-graphs ($r \ge 3$), we do not know a
result similar to Lemma 5. The algorithmic version of Theorem 1 for $r \ge 3$ is
instead proved by introducing a modified concept of index and then by proving 
a somewhat more complicated version of Lemma 5. For lack of space, we shall 
not go into details; the interested reader is referred to Czygrinow and Rodl [13]. 
In fact, in the remainder of this survey we shall mostly concentrate on graphs. 

Even though several applications are known for the algorithmic version of 
Theorem 1 for $r \ge 3$ (see, e.g., [11,12,13,21,22]), it should be mentioned that the
most powerful version of the regularity lemma for hypergraphs is not the one 
presented above. Indeed, the regularity lemma for hypergraphs proved in [18] 
seems to have deeper combinatorial consequences. 

3.2 The Pair Condition for Regularity 

As mentioned in Section 3.1, Lemma 5 may be proved in a few different ways. 
One technique is presented in this section. The second, which is in fact a gener- 
alization of the methods discussed here, is based on Lemmas 9 and 10. Finally, 
the third method, presented in Section 3.4, is based on a criterion for regularity 
given in Lemma 11. 

Let us turn to the original approach of [1,2]. The central idea here may be
summarized in two lemmas, Lemmas 6 and 7 below; we follow the formulation
given in [14] (see also [9,20,50,51] and the proof of the upper bound in
Theorem 15.2 in [16], due to J.H. Lindsey). Below, $d(x, x')$ denotes the joint degree
or codegree of $x$ and $x'$, that is, the number of common neighbours of $x$ and $x'$.

Lemma 6 Let a constant $0 < \varepsilon < 1$ be given and let $B = (U, W; E)$ be a
bipartite graph with $|U| > 2/\varepsilon$. Let $\varrho = d_B(U, W)$ and let $D$ be the collection of
all pairs $\{x, x'\}$ of vertices of $U$ for which

(i) $d(x), d(x') > (\varrho - \varepsilon)|W|$,

(ii) $d(x, x') < (\varrho + \varepsilon)^2|W|$.

Then if $|D| > (1/2)(1 - 5\varepsilon)|U|^2$, the pair $(U, W)$ is $(\varrho, (16\varepsilon)^{1/5})$-regular.

Lemma 7 Let $B = (U, W; E)$ be a graph with $(U, W)$ a $(\varrho, \varepsilon)$-regular pair and
with density $d(U, W) = \varrho$. Assume that $\varrho|W| \ge 1$ and $0 < \varepsilon < 1$. Then

(i) all but at most $2\varepsilon|U|$ vertices $x \in U$ satisfy

$(\varrho - \varepsilon)|W| < d(x) < (\varrho + \varepsilon)|W|$,

(ii) all but at most $2\varepsilon|U|^2$ pairs $\{x, x'\}$ of vertices of $U$ satisfy

$d(x, x') < (\varrho + \varepsilon)^2|W|$.

It is not difficult to see that Lemmas 6 and 7 imply Lemma 5. Indeed, the
main computational task that algorithm A from Lemma 5 has to perform is
to compute the codegrees of all pairs of vertices $(x, x') \in U \times U$. Clearly, this
may be done in time $O(n^3)$. Observing that this task may be encoded as the
squaring of a certain natural $\{0,1\}$-matrix over the integers, one sees that there
is an algorithm A as in Lemma 5 with time complexity $O(M(n))$, where
$M(n) = O(n^{2.376})$ is the time required to carry out such a multiplication (see [10]).

Before we proceed, let us observe that the pleasant fact here is the following.
Although the definition of $\varepsilon$-regularity for a pair $(U, W)$ involves a quantification
over exponentially many pairs $(U', W')$, we may essentially check the validity of
this definition by examining all pairs $(x, x') \in U \times U$, of which there are only
quadratically many. We refer to the criterion for regularity given by Lemmas 6
and 7 as the pair condition for regularity.
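
The computational core of the pair condition is the codegree table, which is the matrix product of the bipartite adjacency matrix with its transpose. The sketch below is a plain cubic-time illustration in pure Python, not the fast matrix multiplication referred to above; the function names and the row-list representation are assumptions made for the example. It also counts the pairs that meet the degree and codegree bounds (i) and (ii) of Lemma 6.

```python
def codegrees(rows):
    """rows[x] is the 0/1 adjacency row of vertex x in U over W.

    Returns C with C[x][y] = number of common neighbours of x and y,
    i.e. the matrix product of the adjacency matrix with its transpose.
    """
    n = len(rows)
    return [[sum(a * b for a, b in zip(rows[x], rows[y])) for y in range(n)]
            for x in range(n)]

def count_good_pairs(rows, eps):
    """Count pairs {x, x'} in U satisfying conditions (i) and (ii) of Lemma 6."""
    n, m = len(rows), len(rows[0])           # n = |U|, m = |W|
    rho = sum(map(sum, rows)) / (n * m)      # density of the pair (U, W)
    deg = [sum(row) for row in rows]
    C = codegrees(rows)
    good = 0
    for x in range(n):
        for y in range(x + 1, n):
            degrees_ok = min(deg[x], deg[y]) > (rho - eps) * m
            codegree_ok = C[x][y] < (rho + eps) ** 2 * m
            if degrees_ok and codegree_ok:
                good += 1
    return good

# A 4 x 4 bipartite graph of density 1/2 in which every vertex has degree 2.
rows = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]
print(codegrees(rows)[0][2])        # vertices 0 and 2 share one neighbour
print(count_good_pairs(rows, 0.1))  # all 6 pairs pass on this toy input
```

Only the $\binom{|U|}{2}$ pairs of vertices are inspected, in contrast with the exponentially many sub-pairs $(U', W')$ in the definition of regularity.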



3.3 An Optimized Pair Condition for Regularity 

Here we state an improved version of the pair condition of Section 3.2. The key
idea is that it suffices to control the codegrees of a small, suitably chosen set of
pairs of vertices to guarantee the regularity of a bipartite graph; that is, we need
not examine all pairs $(x, x') \in U \times U$. As it turns out, it suffices to consider the
pairs that form the edge set of a linear-sized expander, which reduces the number
of pairs to examine from $\binom{n}{2}$ to $O(n)$. This implies that there is an algorithm A
as in Lemma 5 with time complexity $O(n^2)$.

We start with an auxiliary definition. 

Definition 8 Let $0 < \varrho \le 1$ and $A > 0$ be given. We say that a graph $J$ on $n$
vertices is $(\varrho, A)$-uniform if for any pair of disjoint sets $U, W \subseteq V(J)$ such that
$1 \le |U| \le |W| \le r|U|$, where $r = \varrho n$, we have

$$|e_J(U, W) - \varrho|U||W|| \le A\sqrt{r|U||W|}.$$

Thus, a $(\varrho, A)$-uniform graph is a graph with density $\sim \varrho$ in which the edges
are distributed in a random-like manner. One may check that the usual binomial
random graph $G(n, p)$ is $(p, 20)$-uniform (see, e.g., [31]). More relevant to us is the
fact that there are efficiently constructible graphs $J$ that are $(\varrho, O(1))$-uniform,
and have arbitrarily large but constant average degree. Here we have in mind
the celebrated graphs of Margulis [41] and Lubotzky, Phillips, and Sarnak [39]
(see also [40,46]). For these graphs, we have $A = 2$.

Let us now introduce the variant of the pair condition for regularity that is
of interest. Let an $n$ by $n$ bipartite graph $B = (U, W; E)$ be given, and suppose
that $J$ is a graph on $U$. Write $e(J)$ for the number of edges in $J$ and let $p =
e(B)/n^2$ be the density of $B$. We say that $B$ satisfies property $\mathcal{P}(J, \delta)$ if

$$\sum_{\{x,y\} \in E(J)} |d(x, y) - p^2 n| \le \delta p^2 n \, e(J). \tag{5}$$

Below, we shall only be interested in the case in which $J$ is $(\varrho, A)$-uniform for
some constant $A$ and $\varrho \asymp 1/n$. The results analogous to Lemmas 6 and 7
involving property $\mathcal{P}$ are as follows.

Lemma 9 For every $\varepsilon > 0$, there exist $r_0 = r_0(\varepsilon)$, $n_0 = n_0(\varepsilon) \ge 1$, and $\delta =
\delta(\varepsilon) > 0$ for which the following holds. Suppose $n \ge n_0$, the graph $J$ is a
$(\varrho, A)$-uniform graph with $\varrho = r/n \ge r_0/n$, and $B$ is a bipartite graph as above. Then,
if $B$ has property $\mathcal{P}(J, \delta)$, then $B$ is $\varepsilon$-regular.



Lemma 10 For every $\delta > 0$, there exist $r_1 = r_1(\delta)$, $n_1 = n_1(\delta) \ge 1$, and $\varepsilon' =
\varepsilon'(\delta) > 0$ for which the following holds. Suppose $n \ge n_1$, the graph $J$ is a
$(\varrho, A)$-uniform graph with $\varrho = r/n \ge r_1/n$, and $B$ is a bipartite graph as above. Then,
if $B$ does not satisfy property $\mathcal{P}(J, \delta)$, then $B$ is not $\varepsilon'$-regular. Furthermore,
in this case, we can find a pair of sets of vertices $(U', W')$ witnessing the
$\varepsilon'$-irregularity of $B$ in time $O(n^2)$.

Lemmas 9 and 10 show that Lemma 5 holds for an algorithm A with time
complexity $O(n^2)$.
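
As an illustration of why the sparse condition is cheap to evaluate, the sketch below tests a property of the form of $\mathcal{P}(J, \delta)$ by summing codegree deviations only over the edges of an auxiliary graph $J$. It assumes that condition (5) compares each codegree with $p^2 n$ and bounds the total deviation by $\delta p^2 n \, e(J)$. The cycle used for $J$ is a hypothetical stand-in chosen for brevity, not a uniform graph in the sense of Definition 8, and the function name is illustrative.

```python
def has_property_P(rows, J_edges, delta):
    """Test a condition of the form (5) for an n x n bipartite graph.

    rows[x] is the 0/1 adjacency row of vertex x in U; J_edges lists the
    edges {x, y} of the auxiliary graph J on U (assumed form of (5)).
    """
    n = len(rows)
    p = sum(map(sum, rows)) / n ** 2     # density of B
    expected = p * p * n                 # codegree expected in a random-like graph
    deviation = 0.0
    for x, y in J_edges:
        codegree = sum(a * b for a, b in zip(rows[x], rows[y]))
        deviation += abs(codegree - expected)
    return deviation <= delta * expected * len(J_edges)

rows = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]
# Stand-in for J: a 4-cycle on U. Its edges have codegrees 0, 1, 0, 1.
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]

print(has_property_P(rows, cycle, 0.6))  # total deviation 2 <= 0.6 * 1 * 4
print(has_property_P(rows, cycle, 0.4))  # total deviation 2 >  0.4 * 1 * 4
```

Each codegree costs $O(n)$ to compute, so with $e(J) = O(n)$ edges the whole test takes $O(n^2)$ time, in line with the complexity stated above.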

The proof of Lemma 9 is similar in spirit to the proof of Lemma 6, but of
course one has to make heavy use of the $(\varrho, A)$-uniformity of $J$. The sparse version
of the regularity lemma (see, e.g., [33]) may be used to prove the $\varepsilon'$-irregularity
of the graph $B$ in Lemma 10. However, proving that a suitable witness may
be found in quadratic time requires a different approach. The reader is referred
to [34] for details.

3.4 Singular Values and Regularity 

We now present an approach due to Frieze and Kannan [23]. Let an $m$ by $n$ real
matrix $\mathbf{A}$ be given. The first singular value $\sigma_1(\mathbf{A})$ of $\mathbf{A}$ is

$$\sigma_1(\mathbf{A}) = \sup\{|x^T \mathbf{A} y| : \|x\| = \|y\| = 1\}. \tag{6}$$

Above, we use $\|\cdot\|$ to denote the 2-norm. In what follows, for a matrix $\mathbf{W} =
(w_{i,j})$, we let $\|\mathbf{W}\|_\infty = \max |w_{i,j}|$. Moreover, if $I$ and $J$ are subsets of the index
sets of the rows and columns of $\mathbf{W}$, we let

$$\mathbf{W}(I, J) = \sum_{i \in I,\, j \in J} w_{i,j} = x_I^T \mathbf{W} x_J, \tag{7}$$

where $x_I$ and $x_J$ are the $\{0,1\}$-characteristic vectors for $I$ and $J$.

Let $B = (U, W; E)$ be a bipartite graph with density $p = e(B)/|U||W|$, and
let $\mathbf{A} = (a_{i,j})_{i \in U, j \in W}$ be the natural $\{0,1\}$-adjacency matrix associated with $B$.
Put $\mathbf{W} = \mathbf{A} - p\mathbf{J}$, where $\mathbf{J}$ is the $n \times n$ matrix with all entries equal to 1. It is
immediate to check that the following holds:

(*) $B$ is $\varepsilon$-regular if and only if $|\mathbf{W}(U', W')| \le \varepsilon|U'||W'|$ for all $U' \subseteq U$
and $W' \subseteq W$ with $|U'| \ge \varepsilon|U|$ and $|W'| \ge \varepsilon|W|$.

The regularity condition of Frieze and Kannan [23] is as follows. 

Lemma 11 Let $\mathbf{W}$ be a matrix whose entries are indexed by $U \times W$, where $|U| =
|W| = n$, and suppose that $\|\mathbf{W}\|_\infty \le 1$. Let $\gamma > 0$ be given. Then the following
assertions hold:

(i) If there exist $U' \subseteq U$ and $W' \subseteq W$ such that $|U'|, |W'| \ge \gamma n$ and

$|\mathbf{W}(U', W')| \ge \gamma|U'||W'|$,

then $\sigma_1(\mathbf{W}) \ge \gamma^3 n$.

(ii) If $\sigma_1(\mathbf{W}) \ge \gamma n$, then there exist $U' \subseteq U$ and $W' \subseteq W$ such that $|U'|, |W'| \ge
\gamma' n$ and $|\mathbf{W}(U', W')| \ge \gamma'|U'||W'|$, where $\gamma' = \gamma^3/108$. Furthermore, $U'$
and $W'$ may be constructed in polynomial time.

Now, in view of (*) and the fact that singular values may be computed in 
polynomial time (see, e.g., [29]), Lemma 11 provides a proof for Lemma 5. 
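
The link between block sums and the first singular value can be checked numerically. The sketch below forms the deviation matrix (adjacency matrix minus density times the all-ones matrix) of a small bipartite graph, estimates its first singular value by power iteration, and compares it with the elementary lower bound obtained by plugging normalized characteristic vectors into definition (6). Power iteration is used here only as a simple stand-in for a proper singular value routine, and all function names are illustrative.

```python
import math
import random

def deviation_matrix(A):
    """Return A - p*J, where p is the density of the 0/1 matrix A."""
    rows, cols = len(A), len(A[0])
    p = sum(map(sum, A)) / (rows * cols)
    return [[a - p for a in row] for row in A]

def mat_vec(M, v):
    return [sum(m * t for m, t in zip(row, v)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def top_singular_value(M, iterations=200, seed=0):
    """Estimate sigma_1(M) by power iteration on M^T M (a sketch, not a robust solver)."""
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in M[0]]
    Mt = transpose(M)
    for _ in range(iterations):
        w = mat_vec(Mt, mat_vec(M, v))
        norm = math.sqrt(sum(t * t for t in w))
        if norm == 0.0:
            return 0.0
        v = [t / norm for t in w]
    return math.sqrt(sum(t * t for t in mat_vec(M, v)))

def block_sum(M, I, J):
    """Sum of the entries of M over rows in I and columns in J, as in (7)."""
    return sum(M[i][j] for i in I for j in J)

# Perfect matching on 2 + 2 vertices: the deviation matrix is [[0.5, -0.5], [-0.5, 0.5]].
A = [[1, 0], [0, 1]]
Wm = deviation_matrix(A)
sigma = top_singular_value(Wm)

# Normalized characteristic vectors of I and J give
# sigma_1 >= |W(I, J)| / sqrt(|I| |J|).
I, J = [0], [0]
lower = abs(block_sum(Wm, I, J)) / math.sqrt(len(I) * len(J))
print(sigma, lower)
```

A large block sum therefore forces a large singular value; the harder direction, recovering a large block from a large singular value, is the content of part (ii) of Lemma 11.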

4 The Algorithmic Versions of Regularity 

In this section, we discuss the constructive versions of Theorem 1 for the graph 
case, that is r = 2. We discuss deterministic and randomized algorithms sepa- 
rately. 



4.1 Deterministic Algorithms 

Our deterministic result, Theorem 12 below, asserts the existence of an algorithm
for finding regular partitions that is asymptotically faster than the algorithm due
to Alon, Duke, Lefmann, Rodl, and Yuster [1,2].

Theorem 12 There is an algorithm A that takes as input an integer $k_0 \ge 1$,
an $\varepsilon > 0$, and a graph $G$ on $n$ vertices and returns an $\varepsilon$-regular $(\varepsilon, k)$-equitable
partition for $G$ with $k_0 \le k \le K$, where $K$ is as in Theorem 1. Algorithm A runs
in time $\le Cn^2$, where $C = C(\varepsilon, k_0)$ depends only on $\varepsilon$ and $k_0$.

Theorem 12 follows from the considerations in Sections 3.1 and 3.3 (see [34] 
for details). 

It is easy to verify that there is a constant $\varepsilon' = \varepsilon'(\varepsilon, k_0)$ such that if $e(G) <
\varepsilon' n^2$, then any $(\varepsilon, k_0)$-equitable partition is $\varepsilon$-regular. Clearly, as the time
required to read the input is $\Omega(e(G))$ and, as observed above, we may assume
that $e(G) \ge \varepsilon' n^2$, algorithm A in Theorem 12 is optimal, apart from the value
of the constant $C = C(\varepsilon, k_0)$.

A typical application of the algorithmic regularity lemma of [1,2] asserts the
existence of a fast algorithm for a graph problem. As it turns out, the running
time of such an algorithm is often dominated by the time required for finding
a regular partition for the input graph, and hence the algorithm has time
complexity $O(M(n))$. In view of Theorem 12, the existence of quadratic algorithms
for these problems may be asserted. Some examples are given in [34].

4.2 Randomized Algorithms 

We have been discussing deterministic algorithms so far. If we allow
randomization, as proved by Frieze and Kannan, a great deal more may be achieved in
terms of efficiency [21,22]. The model we adopt is as follows. We assume that
sampling a random vertex from $G$ as well as checking an entry of the adjacency
matrix of $G$ both have unit cost.

Theorem 13 There is a randomized algorithm $\mathcal{A}_{FK}$ that takes as input an
integer $k_0 \ge 1$, an $\varepsilon > 0$, a $\delta > 0$, and a graph $G$ on $n$ vertices and returns, with
probability $\ge 1 - \delta$, an $\varepsilon$-regular $(\varepsilon, k)$-equitable partition for $G$ with $k_0 \le k \le K$,
where $K$ is as in Theorem 1. Moreover, the following assertions hold:

(i) Algorithm $\mathcal{A}_{FK}$ runs in time $\le Cn$, where $C = C(\varepsilon, \delta, k_0)$ depends only
on $\varepsilon$, $\delta$, and $k_0$.

(ii) In fact, $\mathcal{A}_{FK}$ first outputs a collection of vertices of $G$ of cardinality $\le C'$,
where $C' = C'(\varepsilon, \delta, k_0)$ depends only on $\varepsilon$, $\delta$, and $k_0$, and the above $\varepsilon$-regular
partition for $G$ may be constructed in linear time from this collection of
vertices.

The fact that one is able to construct a regular partition for a graph on $n$
vertices in randomized time $O(n)$ is quite remarkable; this is further evidence of
the power of randomization. However, even more remarkable is what (ii) implies:
given $\varepsilon > 0$, $\delta > 0$, and $k_0 \ge 1$, there is a uniform bound $C' = C'(\varepsilon, \delta, k_0)$ on the
number of randomly chosen vertices that will implicitly define for us a suitable
$\varepsilon$-regular partition of the given graph, no matter how large this input graph is.

Let us try to give a feel for how one may proceed to prove Theorem 13.
We shall discuss a lemma that is central in the Frieze-Kannan approach. The
idea is that, if a bipartite graph $B = (U, W; E)$ is not $\varepsilon$-regular, then this may
be detected, with high probability, by sampling a bounded number of vertices
of $B$. The effectiveness of sampling for detecting “dense spots” in graphs appears
already in Goldreich, Goldwasser, and Ron [25,26], where many constant time
algorithms are developed (we shall discuss such matters in Section 5 below).

Let us state the key technical lemma in the proof of Theorem 13. Let an
$n$ by $n$ bipartite graph $B = (U, W; E)$ be given. Our aim is to check whether
there exist $U' \subset U$ and $W' \subset W$ such that $|U'|, |W'| \ge \varepsilon n$ and

$|d_B(U', W') - d_B(U, W)| > \varepsilon$.

Note that this last inequality is equivalent to $|e(U', W') - p|U'||W'|| > \varepsilon |U'||W'|$,
where $p = e(B)/n^2$ is the density of $B$. In fact, if

$|e(U', W') - p|U'||W'|| \ge \gamma n^2$

holds and $\gamma > \varepsilon$, then $(U', W')$ must be a witness for the $\varepsilon$-irregularity of $B$.
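
The witness claim can be checked directly. The short derivation below is a sketch that assumes the hypothesis has the form $|e(U',W') - p|U'||W'|| \ge \gamma n^2$ with $\gamma > \varepsilon$ (the form that also appears in Lemma 14), and uses only $|U'|, |W'| \le n$ and $0 \le p \le 1$.

```latex
% Size condition: the deviation is at most |U'||W'|, and |W'| \le n, so
\gamma n^2 \;\le\; \bigl| e(U',W') - p\,|U'||W'| \bigr| \;\le\; |U'||W'| \;\le\; |U'|\, n
\quad\Longrightarrow\quad |U'| \;\ge\; \gamma n \;>\; \varepsilon n ,
% and symmetrically |W'| > \varepsilon n.
%
% Density condition: since |U'||W'| \le n^2,
\bigl| e(U',W') - p\,|U'||W'| \bigr| \;\ge\; \gamma n^2 \;>\; \varepsilon n^2 \;\ge\; \varepsilon\, |U'||W'| .
```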

We are now ready to state the result of Frieze and Kannan that allows one 
to prove a randomized version of Lemma 5. The reader may wish to compare 
this result with Lemmas 7 and 10. 

Lemma 14 There is a randomized algorithm $\mathcal{A}$ that behaves as follows. Let $B =
(U, W; E)$ be as above and let $\gamma$ and $\delta > 0$ be positive reals. Suppose there
exist $U' \subset U$ and $W' \subset W$ for which $e(U', W') \ge p|U'||W'| + \gamma n^2$ holds. Then, on
input $B$, $\gamma > 0$, and $\delta > 0$, algorithm $\mathcal{A}$ determines, with probability $\ge 1 - \delta$,
implicitly defined sets $U'' \subset U$ and $W'' \subset W$ with

e{U",W")>p\U"\\W"\ + ^^n^. 

The running time of $\mathcal{A}$ is bounded by some constant $C = C(\gamma, \delta)$ that depends
only on $\gamma$ and $\delta$.

A few comments concerning Lemma 14 are in order. As before, the model
here allows for the selection of random vertices of $B$ as well as checking whether
two given vertices are adjacent in constant time. In order to define the sets $U''$
and $W''$, algorithm $\mathcal{A}$ returns two sets $Z_1 \subset U$ and $Z_2 \subset W$, both of cardinality
bounded by some constant depending only on $\gamma$ and $\delta > 0$. Then, $U''$ is simply
the set of vertices $u \in U$ for which $e(u, Z_2) \ge p|Z_2|$. The set $W''$ is defined
analogously.
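
The way the two bounded samples define $U''$ and $W''$ can be illustrated as follows. This is only a sketch of the definition: how the algorithm of Lemma 14 chooses $Z_1$ and $Z_2$ is not described here, so they are taken as inputs, and the function name and the edge-set representation are assumptions.

```python
def implicit_sets(u_side, w_side, edges, z1, z2, p):
    """Return (U'', W'') defined by samples z1 (a subset of U) and z2 (a subset of W).

    edges is a set of (u, w) pairs with u in U and w in W;
    p is the density e(B) / n^2 of the bipartite graph.
    """
    # U'': vertices of U with at least p * |Z_2| neighbours in Z_2.
    u2 = [u for u in u_side
          if sum((u, z) in edges for z in z2) >= p * len(z2)]
    # W'': vertices of W with at least p * |Z_1| neighbours in Z_1.
    w2 = [w for w in w_side
          if sum((z, w) in edges for z in z1) >= p * len(z1)]
    return u2, w2
```

Deciding membership of a single vertex costs $|Z_1|$ or $|Z_2|$ adjacency queries, which is why the sets can be called implicitly defined: they never need to be listed in full.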

Algorithm $\mathcal{A}$ of Lemma 14 is extremely simple, and its elegant proof of
correctness is based on the linearity of expectation and on well known large
deviation inequalities (see Frieze and Kannan [21]).

We close this section by mentioning a result complementary to Lemma 14 (see
[15] for a slightly weaker statement). Suppose $B = (U, W; E)$ is $\varepsilon$-regular. Then
if $U' \subset U$ and $W' \subset W$ are randomly chosen sets of vertices with $|U'| =
|W'| \ge M_0 = M_0(\varepsilon', \delta)$, then the bipartite graph $B' = (U', W'; E')$ induced
by $B$ on $(U', W')$ is $\varepsilon'$-regular with probability $\ge 1 - \delta$, as long as $\varepsilon \le \varepsilon_0(\varepsilon', \delta)$.
Here again the striking fact is that a bounded sample of vertices forms a good
enough picture of the whole graph.

5 Property Testing

The topic of this section is different in nature from the topics discussed so far. We
have been considering how to produce algorithms for finding regular partitions of
graphs. In this section, we discuss how non-constructive versions of the regularity
lemma may be used to prove the correctness of certain algorithms. We shall
discuss a recent result due to Alon, Fischer, Krivelevich, and Szegedy [3,4]. These
authors develop a new variant of the regularity lemma and use it to prove a
far-reaching result concerning the testability of certain graph properties.



5.1 Definitions and the Testability Result 

The general notion of property testing was introduced by Rubinfeld and
Sudan [45], but in the context of combinatorial testing it is the work of Goldreich
and his co-authors [24,25,26,27,28] that is most relevant to us.

Let $\mathcal{G}^n$ be the collection of all graphs on a fixed $n$-vertex set, say $[n] =
\{1, \dots, n\}$. Put $\mathcal{G} = \bigcup_{n \ge 1} \mathcal{G}^n$. A property of graphs is simply a subset $\mathcal{P} \subset \mathcal{G}$ that
is closed under isomorphisms. There is a natural notion of distance in each $\mathcal{G}^n$,
the normalized Hamming distance: the distance $d(G, H) = d_n(G, H)$ between
two graphs $G$ and $H$ in $\mathcal{G}^n$ is $|E(G) \,\triangle\, E(H)| \binom{n}{2}^{-1}$, where $E(G) \,\triangle\, E(H)$ denotes
the symmetric difference of the edge sets of $G$ and $H$.

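We say that a graph $G$ is $\varepsilon$-far from having property $\mathcal{P}$ if

$d(G, \mathcal{P}) = \min_{H \in \mathcal{P}} d(G, H) \ge \varepsilon$,

that is, at least $\varepsilon \binom{n}{2}$ edges have to be added to or removed from $G$ to turn it into a
graph that satisfies $\mathcal{P}$.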
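
The distance just defined is easy to compute for explicitly given graphs. The sketch below is only an illustration of the definition; the function names are hypothetical, and the property is represented, for the sake of the example, by an explicit finite list of edge sets on the same vertex set.

```python
from math import comb

def normalized_hamming_distance(n, edges_g, edges_h):
    """|E(G) symmetric-difference E(H)| divided by binom(n, 2)."""
    eg = {frozenset(e) for e in edges_g}
    eh = {frozenset(e) for e in edges_h}
    return len(eg ^ eh) / comb(n, 2)

def distance_to_property(n, edges_g, graphs_with_property):
    """Minimum distance from G to an explicitly listed family of graphs."""
    return min(normalized_hamming_distance(n, edges_g, h)
               for h in graphs_with_property)
```

For instance, on four vertices a triangle $\{12, 23, 13\}$ and a path $\{12, 23, 34\}$ differ in two of the six possible edges, so their distance is $1/3$.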

An $\varepsilon$-test for a graph property $\mathcal{P}$ is a randomized algorithm $\mathcal{A}$ that receives as
input a graph $G$ and behaves as follows: if $G$ has $\mathcal{P}$ then with probability $\ge 2/3$
we have $\mathcal{A}(G) = 1$, and if $G$ is $\varepsilon$-far from having $\mathcal{P}$ then with probability $\ge 2/3$
we have $\mathcal{A}(G) = 0$. The graph $G$ is given to $\mathcal{A}$ through an oracle; we assume
that $\mathcal{A}$ is able to generate random vertices from $G$ and it may query the oracle
whether two vertices that have been generated are adjacent.

We say that a graph property $\mathcal{P}$ is testable if, for all $\varepsilon > 0$, it admits an $\varepsilon$-test
that makes at most $Q$ queries to the oracle, where $Q = Q(\varepsilon)$ is a constant that
depends only on $\varepsilon$. Note that, in particular, we require the number of queries to
be independent of the order of the input graph.

Goldreich, Goldwasser, and Ron [25,26], besides showing that there exist NP
graph properties that are not testable, proved that a large class of interesting
graph properties are testable, including the property of being $k$-colourable, of
having a clique with $\ge \varrho n$ vertices, and of having a cut with $\ge \varrho n^2$ edges, where $n$
is the order of the input graph. The regularity lemma is not used in [25,26]. The
fact that $k$-colourability is testable had in fact been proved implicitly in [15],
where regularity is used.

We are now ready to turn to the result of Alon, Fischer, Krivelevich, and
Szegedy [3,4]. Let us consider properties from the first order theory of graphs.
Thus, we are concerned with properties that may be expressed through
quantification of vertices, Boolean connectives, equality, and adjacency. Of particular
interest are the properties that may be expressed in the form

$\exists x_1, \dots, x_r \ \forall y_1, \dots, y_s \ A(x_1, \dots, x_r, y_1, \dots, y_s)$,

where $A$ is a quantifier-free first order expression. Let us call such properties of
type $\exists\forall$. Similarly, we define properties of type $\forall\exists$. The main result of [3,4] is as
follows.

Theorem 15 All first order properties of graphs that may be expressed with at
most one quantifier as well as all properties that are of type $\exists\forall$ are testable.
Furthermore, there exist properties of type $\forall\exists$ that are not testable.

The first part of the proof of the positive result in Theorem 15 involves the
reduction, up to testability, of properties of type $\exists\forall$ to a certain generalized
colourability property. A new variant of the regularity lemma is then used to
handle this generalized colouring problem.



5.2 A Variant of the Regularity Lemma 

In this section we shall state a variant of the regularity lemma proved in [3,4].

Let us say that a partition $\Pi = (V_i)_{1 \le i \le k}$ of a set $V$ is an equipartition of $V$ if
all the sets $V_i$ ($1 \le i \le k$) differ by at most 1 in size. In this section, we shall not
have exceptional classes in our partitions. Below, we shall have an equipartition
of $V$

$\Pi' = \{V_{i,j} : 1 \le i \le k,\ 1 \le j \le \ell\}$

that is a refinement of a given partition $\Pi = (V_i)_{1 \le i \le k}$. In this notation, we
understand that, for all $i$, all the $V_{i,j}$ ($1 \le j \le \ell$) are contained in $V_i$.

Theorem 16 For every integer $k_0$ and every function $0 < \varepsilon(r) < 1$ defined on
the positive integers, there are constants $K = K(k_0, \varepsilon)$ and $N = N(k_0, \varepsilon)$ with
the following property. If $G$ is any graph with at least $N$ vertices, then there
exist equipartitions $\Pi = (V_i)_{1 \le i \le k}$ and $\Pi' = (V_{i,j})_{1 \le i \le k,\ 1 \le j \le \ell}$ of $V = V(G)$
such that the following hold:

(i) $|\Pi| = k \ge k_0$ and $|\Pi'| = k\ell \le K$;

(ii) at least $(1 - \varepsilon(0))\binom{k}{2}$ of the pairs $(V_i, V_{i'})$ with $1 \le i < i' \le k$ are
$\varepsilon(0)$-regular;

(iii) for all $1 \le i < i' \le k$, we have that at least $(1 - \varepsilon(k))\ell^2$ of the pairs $(V_{i,j}, V_{i',j'})$
with $j, j' \in [\ell]$ are $\varepsilon(k)$-regular;

(iv) for at least $(1 - \varepsilon(0))\binom{k}{2}$ of the pairs $1 \le i < i' \le k$, we have that for at
least $(1 - \varepsilon(0))\ell^2$ of the pairs $j, j' \in [\ell]$ we have

$|d_G(V_i, V_{i'}) - d_G(V_{i,j}, V_{i',j'})| \le \varepsilon(0)$.

Suppose we have partitions $\Pi$ and $\Pi'$ as in Theorem 16 above and that
$\varepsilon(k) \ll 1/k$. It is not difficult to see that then, for many ‘choice’ functions
$j \colon [k] \to [\ell]$, we have that $(V_{i,j(i)})_{1 \le i \le k}$ is an equipartition of an induced
subgraph of $G$ such that the following hold:

(a) all the pairs $(V_{i,j(i)}, V_{i',j(i')})$ are $\varepsilon(k)$-regular,

(b) for at least $(1 - \varepsilon(0))\binom{k}{2}$ of the pairs $1 \le i < i' \le k$, we have

$|d_G(V_i, V_{i'}) - d_G(V_{i,j(i)}, V_{i',j(i')})| \le \varepsilon(0)$.

In a certain sense, this consequence of Theorem 16 lets us ignore the irregular
pairs in the partition $\Pi$, at the expense of dropping down from the $V_i$ to smaller
sets $V_{i,j(i)}$ (still all of cardinality $\Omega(n)$), and having most but not necessarily all
densities $d_G(V_{i,j(i)}, V_{i',j(i')})$ under tight control.

Let us remark in passing that, naturally, one may ask whether Theorem 1 may
be strengthened by requiring that there should be no irregular pairs altogether.
This question was already raised by Szemerédi in [48]. As observed by Lovász,
Seymour, Trotter, and the authors of [2] (see p. 82 in [2]), such an extension of
Theorem 1 does not exist. As noted above, Theorem 16 presents a way around
this difficulty.

Theorem 16 and its corollary mentioned above are the main ingredients in 
the proof of the following result (see [3,4] for details). 

Theorem 17 For every $\varepsilon > 0$ and $h \ge 1$, there is $\delta = \delta(\varepsilon, h) > 0$ for which
the following holds. Let $H$ be an arbitrary graph on $h$ vertices and let $\mathcal{P} =
\mathrm{Forb}_{\mathrm{ind}}(H)$ be the property of not containing $H$ as an induced subgraph. If an
$n$-vertex graph $G$ is $\varepsilon$-far from $\mathcal{P}$, then $G$ contains $\delta n^h$ induced copies of $H$.

The case in which $H$ is a complete graph follows from the original regularity
lemma, but the general case requires the corollary to Theorem 16 discussed
above. Note that Theorem 17 immediately implies that the property of
membership in $\mathrm{Forb}_{\mathrm{ind}}(H)$ (in other words, the property of not containing an induced
copy of $H$) is a testable property for any graph $H$.
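
The tester implicit in this remark can be sketched as follows: sample a few random $h$-sets of vertices and reject if one of them induces a copy of $H$. This is an illustration under stated assumptions rather than the procedure of [3,4]: the number of samples is left as a hypothetical parameter `samples` (Theorem 17 only suggests that a value depending on $\delta$, and not on $n$, should suffice), and the names `induces_copy` and `sample_tester` are invented for the example.

```python
import itertools
import random

def induces_copy(adjacent, vertices, h_edges, h):
    """True if some ordering of `vertices` induces exactly H on {0, ..., h-1}."""
    h_set = {frozenset(e) for e in h_edges}
    for perm in itertools.permutations(vertices):
        if all(adjacent(perm[a], perm[b]) == (frozenset((a, b)) in h_set)
               for a, b in itertools.combinations(range(h), 2)):
            return True
    return False

def sample_tester(n, adjacent, h_edges, h, samples, rng):
    """Return 0 (reject) if a sampled h-set induces H, and 1 (accept) otherwise.

    `adjacent(u, v)` plays the role of the oracle; only sampled vertices
    are ever queried, so the query count is at most samples * binom(h, 2)
    per ordering tried, independently of n.
    """
    for _ in range(samples):
        chosen = rng.sample(range(n), h)
        if induces_copy(adjacent, chosen, h_edges, h):
            return 0
    return 1
```

A graph in $\mathrm{Forb}_{\mathrm{ind}}(H)$ is always accepted by this sketch, so its error is one-sided.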

The proof of Theorem 15 requires a generalization of Theorem 17 related to 
the colouring problem alluded to at the end of Section 5.1. We refer the reader 
to [3,4]. We close by remarking that Theorem 16 has an algorithmic version, 
although we stress that this is not required in the proof of Theorem 15. 

6 Concluding Remarks 

We have not discussed a few recent, important results that relate to the regularity 
lemma. We single out three topics that the reader may wish to pursue. 

6.1 Constants 

The constants involved in Theorem 1 are extremely large. The proof in [48] gives
that $K$ in Theorem 1 is bounded from above by a tower of 2s of height $\varepsilon^{-5}$.







Y. Kohayakawa and V. Rodl 



A recent result of Gowers [30] in fact shows that this cannot be essentially
improved. Indeed, it is proved in [30] that there are graphs for which any
$\varepsilon$-regular partition must have at least $G(\varepsilon^{-c})$ parts, where $c > 0$ is some absolute
constant and $G(x)$ is a tower of 2s of height $\lfloor x \rfloor$.

The size of $K$ is very often not too relevant in applications, but in certain
cases essentially better results may be obtained if one is able to avoid the
appearance of such huge constants. In view of Gowers’s result, this can only be
accomplished by modifying the regularity lemma. One early instance in which
this is carried out appears in [14]; a more recent example is [21].
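
To give a feeling for the size of such constants, the tower function can be evaluated for very small heights. The sketch below assumes the usual convention that a tower of 2s of height $h$ is $2^{2^{\cdot^{\cdot^{2}}}}$ with $h$ twos (and equals 1 for $h = 0$); the function name is hypothetical.

```python
def tower(height):
    """Tower of 2s of the given height: tower(h) = 2 ** tower(h - 1), tower(0) = 1."""
    value = 1
    for _ in range(height):
        value = 2 ** value
    return value
```

Already a tower of height 5 is the number $2^{65536}$, and height 6 is far beyond anything that can be stored, which is why bounds of this kind are of purely theoretical interest.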

6.2 Approximation Schemes for Dense Problems 

Frieze and Kannan have developed variants of the regularity lemma for graphs 
and hypergraphs and discuss several applications in [21,22]. The applications 
are mostly algorithmic and focus on ‘dense’ problems, such as the design of a 
PTAS for the max-cut problem for dense graphs. The algorithmic versions of 
their variants of the regularity lemma play a central role in this approach. 

For more applications of algorithmic regularity to ‘dense’ problems, the
reader is referred to [11,12,13,32].

6.3 The Blow-Up Lemma 

We close with an important lemma due to Komlós, Sárközy, and Szemerédi [37],
the so-called blow-up lemma. (For an alternative proof of this lemma, see [44].)

In typical applications of the regularity lemma, once a suitably regular
partition of some given graph $G$ is found, one proceeds by embedding some ‘target
graph’ $H$ of bounded degree into $G$. Until recently, the embedding techniques
could only handle graphs $H$ with many fewer vertices than $G$. The blow-up
lemma is a novel tool that allows one to embed target graphs $H$ that even have
the same number of vertices as $G$. The combined use of the regularity lemma and
the blow-up lemma is a powerful new machinery in graph theory. The reader is
referred to Komlós [36] for a discussion of the applications of the blow-up lemma.

On the algorithmic side, the situation is good: Komlós, Sárközy, and
Szemerédi [38] have also proved an algorithmic version of the blow-up lemma (see
Rödl, Ruciński, and Wagner [43] for an alternative proof).

Abstract. We look at the minimal size of a maximal matching in general, bipartite
and d-regular random graphs. We prove that the ratio between the sizes of any
two maximal matchings approaches one in dense random graphs and random
bipartite graphs. Weaker bounds hold for sparse random graphs and random
d-regular graphs. We also describe an algorithm that with high probability finds a
matching of size strictly less than n/2 in a cubic graph. The result is based on
approximating the algorithm dynamics by a system of linear differential equations.



1 Introduction 

A matching in a graph is a set of disjoint edges. Several optimisation problems are
definable in terms of matchings. If $G$ is a graph, $M$ is a matching in $G$, and the goal
is to maximise the number of edges in $M$, then the corresponding
problem is that of finding a maximum cardinality matching in $G$. This problem has a
glorious history and an important place among combinatorial problems [2,5,8]. However
few other matching problems share its nice combinatorial properties. If $G = (V, E)$ is a
graph, a matching $M \subseteq E$ is maximal if for every $e \in E \setminus M$, $M \cup \{e\}$ is not a matching;
$V(M) = \{v : \exists u,\ \{u, v\} \in M\}$. Let $\beta(G)$ denote the minimum cardinality of a maximal
matching in $G$. The minimum maximal matching problem is that of finding a maximal
matching in $G$ with $\beta(G)$ edges. The problem is NP-hard [10]. The size of any maximal
matching is at most $2\beta(G)$ [6] in general graphs and at most $(2 - \frac{1}{d})\beta(G)$ [11] in
regular graphs of degree $d$. Some negative results are known about the approximability
of $\beta(G)$ [11].
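
The definitions above can be made concrete with a small brute-force sketch. It is exponential in the number of edges and is meant only for tiny examples, in line with the NP-hardness just mentioned; the function names are hypothetical and graphs are given as lists of vertex pairs.

```python
from itertools import combinations

def is_matching(edge_subset):
    """True if no two edges of edge_subset share an endpoint."""
    seen = set()
    for u, v in edge_subset:
        if u in seen or v in seen:
            return False
        seen.update((u, v))
    return True

def is_maximal_matching(edges, matching):
    """A matching is maximal iff every edge of the graph meets V(M)."""
    if not is_matching(matching):
        return False
    covered = {x for e in matching for x in e}
    return all(u in covered or v in covered for u, v in edges)

def beta(edges):
    """Minimum cardinality of a maximal matching, by exhaustive search."""
    for size in range(len(edges) + 1):
        for candidate in combinations(edges, size):
            if is_maximal_matching(edges, candidate):
                return size
```

On the path with edges 01, 12, 23 the middle edge alone is a maximal matching, so $\beta = 1$, while the two outer edges form a maximal matching of size 2, which meets the bound $2\beta(G)$ with equality.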

In this paper we abandon the pessimistic point of view of worst-case algorithmic
analysis by assuming that each input graph $G$ occurs with a given probability. Nothing
seems to be known about the most likely value of $\beta(G)$ or the effectiveness of any
approximation heuristics in this setting. In Section 2 we prove that the most likely value
of $\beta(G)$ can be estimated quite precisely, for instance, if $G$ is chosen at random among
all graphs with a given number of vertices. Similar results are proved in Section 3 for
dense random bipartite graphs. Also, simple algorithms exist which, with high probability
(w.h.p.), that is with probability approaching one as $n = |V(G)|$ tends to infinity, return
matchings of size $\beta(G) + o(n)$. Lower bounds on $\beta(G)$, improving the ones presented
above, are proved also in the case when higher probability is given to graphs with few
edges. Most of the bounds on $\beta(G)$ are obtained by exploiting a simple relation between
maximal matchings and independent sets. In Section 4 we investigate the possibility of
applying a similar reasoning if $G$ is a random $d$-regular graph. After showing a number
of lower bounds on $\beta(G)$ for several values of $d$, we present an algorithm that finds a
maximal matching in a $d$-regular graph. We prove that with high probability it returns a
matching of size asymptotically less than $n/2$ if $G$ is a random cubic graph.

In what follows $\mathcal{G}(n, p)$ ($\mathcal{G}(K_{n,n}, p)$) denotes the usual model of random (bipartite)
graphs as defined in [1]. Also $\mathcal{G}(n, d\text{-reg})$ denotes the following model for random
$d$-regular graphs [9, Section 4]. Let $n$ urns be given, each containing $d$ balls (with $dn$ even):
a set of $dn/2$ pairs of balls (called a configuration) is chosen at random among those
containing neither pairs with two balls from the same urn nor couples of pairs with balls
coming from just two urns. To get a random $G \in \mathcal{G}(n, d\text{-reg})$ let $\{i, j\} \in E(G)$ if and
only if there is a pair with one ball belonging to urn $i$ and the other belonging to urn $j$. If
$\mathcal{G}$ is a random graph model, $G \in \mathcal{G}$ means that $G$ is selected with a probability defined
by $\mathcal{G}$. The random variable $X = X_k(G)$ counts the number of maximal matchings of
size $k$ in $G$. The meaning of the sentences “almost always (a.a.)”, “for almost every (a.e.)
graph” is defined in [1, Ch. II].
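
The urn model can be simulated by rejection sampling: draw a uniformly random pairing of the $dn$ balls and start again whenever it contains a pair inside one urn or two pairs joining the same two urns. The following is a minimal sketch under that reading of the model; the function name is hypothetical, and the expected number of restarts grows with $d$, so it is only practical for small degrees.

```python
import random

def random_regular_graph(n, d, rng):
    """Sample a simple d-regular graph on range(n) via the configuration model."""
    if (n * d) % 2 != 0:
        raise ValueError("dn must be even")
    while True:
        # One entry per ball, labelled by its urn; shuffling and pairing
        # consecutive entries gives a uniformly random configuration.
        balls = [urn for urn in range(n) for _ in range(d)]
        rng.shuffle(balls)
        pairs = [(balls[i], balls[i + 1]) for i in range(0, n * d, 2)]
        has_loop = any(a == b for a, b in pairs)
        edges = {frozenset(pair) for pair in pairs}
        # Reject pairings with a pair inside one urn or a repeated urn pair.
        if not has_loop and len(edges) == len(pairs):
            return edges
```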



2 General Random Graphs 



Let $q = 1 - p$. If $U$ is a random indicator, $\Pr[U]$ will denote $\Pr[U = 1]$.

Theorem 1. If $G \in \mathcal{G}(n, p)$ then $E(X) = \binom{n}{2k} \frac{(2k)!}{k!\, 2^k}\, p^k q^{\binom{n-2k}{2}}$.

Proof. Let $M_i$ be a set of $k$ independent edges, assume that $G$ is a random graph sampled
according to the model $\mathcal{G}(n, p)$ and let $X^i$ be the random indicator equal to one if $M_i$
is a maximal matching in $G$. Then $E(X^i) = \Pr[X^i] = p^k q^{\binom{n-2k}{2}}$, and by linearity of
expectation

$E(X) = \sum_{|M_i| = k} E(X^i) = |\{M_i : |M_i| = k\}| \cdot p^k q^{\binom{n-2k}{2}}$.

The number of matchings of size $k$ is equal to the possible ways of choosing $2k$ vertices
out of $n$ times the number of ways of connecting them by $k$ independent edges divided
by the number of orderings of these chosen edges. □
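
The counting argument can be checked on very small instances by enumerating every graph on $n$ vertices. The sketch below assumes the closed form $\binom{n}{2k}\frac{(2k)!}{k!\,2^k}\,p^k q^{\binom{n-2k}{2}}$, which is what the proof's counting yields; the function names are hypothetical and exact rational arithmetic is used so that the two values can be compared for equality.

```python
from fractions import Fraction
from itertools import combinations
from math import comb, factorial

def expected_maximal_matchings(n, k, p):
    """Closed form for E(X): (number of k-matchings in K_n) * p^k * q^binom(n-2k, 2)."""
    q = 1 - p
    count = comb(n, 2 * k) * factorial(2 * k) // (factorial(k) * 2 ** k)
    return count * p ** k * q ** comb(n - 2 * k, 2)

def brute_force_expectation(n, k, p):
    """Sum over all graphs on n vertices of Pr[G] * #(maximal matchings of size k)."""
    q = 1 - p
    all_edges = list(combinations(range(n), 2))
    total = Fraction(0)
    for mask in range(2 ** len(all_edges)):
        edges = [e for i, e in enumerate(all_edges) if mask >> i & 1]
        weight = p ** len(edges) * q ** (len(all_edges) - len(edges))
        maximal = 0
        for candidate in combinations(edges, k):
            covered = {x for e in candidate for x in e}
            is_matching = len(covered) == 2 * k
            if is_matching and all(u in covered or v in covered for u, v in edges):
                maximal += 1
        total += weight * maximal
    return total
```

The enumeration visits $2^{\binom{n}{2}}$ graphs, so it is only a sanity check for $n \le 5$ or so.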

A lower bound on $\beta(G)$ is obtained by bounding $E(X)$ and then using the Markov
inequality to prove that $\Pr[X > 0]$ approaches zero as the number of vertices in the
graph becomes large. Assuming $2k = n - 2\omega$



E(X) < 



( 2 ^) 



«)■ 



<(f)^ 



\ npqcj J ^ 



and this goes to zero only if $\omega = \Omega(\sqrt{n})$. However a different argument gives a
considerably better result.

Theorem 2. $\beta(G) > \frac{n}{2} - \frac{\log n}{\log(1/q)}$ for a.e. $G \in \mathcal{G}(n, p)$ with $p$ constant.



Proof. If $M$ is a maximal matching in $G$ then $V \setminus V(M)$ is an independent set. Let
$Z = Z_{p,2\omega}$ be the random variable counting independent sets of size $2\omega = \frac{2 \log n}{\log(1/q)}$ in a
random graph $G$. If $X$ counts maximal matchings of size $k = \frac{n}{2} - \omega$,

$\Pr[X > 0] = \Pr[X > 0 \mid Z > 0] \Pr[Z > 0] + \Pr[X > 0 \mid Z = 0] \Pr[Z = 0]$

$\le \Pr[X > 0 \mid Z > 0] \Pr[Z > 0] + 0 \cdot 1 \le \Pr[Z > 0] \to 0$.

The last result follows from a theorem in [4] on the independence number of dense
random graphs. Thus $\beta(G) > \frac{n}{2} - \frac{\log n}{\log(1/q)}$ for a.e. $G \in \mathcal{G}(n, p)$. □

The argument before Theorem 2 is weak because even if $E(Z_{p,2\omega})$ is small $E(X)$
might be very large. The random graph $G$ might have very few independent sets of size
$2\omega$ but many maximal matchings of size $\frac{n}{2} - \omega$.

Results in [4] also have algorithmic consequences. Grimmett and McDiarmid
considered the simple greedy heuristic which repeatedly places a vertex $v$ in the independent
set $I$ if there is no $u \in I$ with $\{u, v\} \in E(G)$ and removes it from $G$. It is easily proved

Theorem 3. $\beta(G) < \frac{n}{2} - \frac{\log n}{2 \log(1/q)}$ for a.e. $G \in \mathcal{G}(n, p)$ with $p$ constant.

Proof. Let $\mathcal{A}$ be an algorithm that first finds a maximal independent set $I$ in $G$ using
the algorithm above and then looks for a perfect matching in the remaining graph. With
probability approaching one $|I| \ge (1 - \delta) \frac{\log n}{\log(1/q)}$ for all $\delta > 0$. Also, $\mathcal{A}$ does not
expose any edge in $G - I$. Hence $G - I$ is a completely random graph on about $n - |I|$
vertices, each edge in it being chosen with constant probability $p$. Results in [3] imply
that a.a. such graphs contain a matching with at most one unmatched vertex. □

Independent sets are useful also for sparse graphs. If $p = c/n$ a lower bound on $\beta(G)$
can be obtained again by studying $\alpha(G)$, the size of a largest independent set of vertices
in $G$.

Theorem 4. $\beta(G) > \frac{n}{2} - \frac{n \log c}{c}$ for a.e. $G \in \mathcal{G}(n, c/n)$, with $c > 2.27$.

Proof. $\alpha(G) < \frac{2n \log c}{c}$ for a.e. $G \in \mathcal{G}(n, c/n)$ for $c > 2.27$ [1, Theorem XI.22]. The result
follows by an argument similar to that of Theorem 2. □

If $p = c/n$ for $c$ sufficiently small, the exact expression for $E(X)$ in Theorem 1 gives
an improved lower bound on $\beta(G)$. Roughly, if $c$ is sufficiently small and $U$ is a large
independent set in $G$ then the graph induced by $V \setminus U$ very rarely contains a perfect
matching.

Theorem 5. /3(G) > | for a.e. G G Gn,c/n, with c G (2.27, 16.99] 



Proof. Let fc = ^ If c? G (f , f ) then k < n/3. Hence k\ — and 



E(A) < 0(1) . 



n dn 

2 c 



- -\-d 



which goes to zero for every d in the given range. The best choice of d is the smallest 
and the theorem follows by noticing that | if c > 16.9989. □ 



3 Bipartite Graphs 

The results in the last section can be extended to the case when $G \in \mathcal{G}(K_{n,n}, p)$. Again
$\beta(G)$ is closely related to a graph parameter whose value, at least in dense random
graphs, can be estimated rather well. Given a bipartite graph $G = (V_1, V_2, E)$ with
$|V_1| = |V_2| = n$, a split independent set in $G$ is a set of $2\omega$ independent vertices $S$ with
$|S \cap V_i| = \omega$. Let $\sigma(G)$ be the size of a largest split independent set in $G$. If $M$ is a
maximal matching in a bipartite graph $G$ then $V \setminus V(M)$ is a split independent set.

Theorem 6. If $G \in \mathcal{G}(K_{n,n}, p)$ then

1. $E(X) = \binom{n}{k}^2 k!\, p^k q^{(n-k)^2}$.

2. If $Z = Z_{p,n-k}$ is the random variable counting split independent sets of size $n - k$
and $Y = Y_{p,k}$ is the random variable counting perfect matchings in $H \in \mathcal{G}(K_{k,k}, p)$
then $E(X) = E(Z) \cdot E(Y)$.

Proof. Let $M_i$ be a set of $k$ independent edges and $G \in \mathcal{G}(K_{n,n}, p)$ and let $X^i$ be the
random indicator equal to one if $M_i$ is a maximal matching in $G$. $E(X^i) = \Pr[X^i] =
p^k q^{(n-k)^2}$. Then

$E(X) = \sum_{|M_i| = k} E(X^i) = |\{M_i : |M_i| = k\}| \cdot p^k q^{(n-k)^2}$.

The number of matchings of size $k$ is given by the possible ways of choosing $k$ vertices
out of $n$ on each side times the number of permutations on $k$ elements. □
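
The second part of Theorem 6 can be verified in one line once the two factors are written out. The computation below is a sketch that assumes the first part has the form $E(X) = \binom{n}{k}^2 k!\, p^k q^{(n-k)^2}$, as the counting in the proof gives, and reads "size $n - k$" as $n - k$ vertices on each side.

```latex
% Split independent sets with n-k vertices on each side of K_{n,n}:
E(Z) = \binom{n}{n-k}^{2} q^{(n-k)^2} = \binom{n}{k}^{2} q^{(n-k)^2},
% perfect matchings in a random subgraph of K_{k,k}:
\qquad E(Y) = k!\, p^{k},
% and therefore
\qquad E(Z)\cdot E(Y) = \binom{n}{k}^{2} k!\, p^{k} q^{(n-k)^2} = E(X).
```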

If $p$ is constant, it is fairly easy to bound the first two moments of $Z$ and get good
estimates on the value of $\sigma(G)$.

Theorem 7. $\sigma(G) \sim \frac{4 \log n}{\log(1/q)}$ for a.e. $G \in \mathcal{G}(K_{n,n}, p)$ with $p$ constant.

Proof. The expected number of split independent sets of size $2\omega$ is $\binom{n}{\omega}^2 q^{\omega^2}$. Hence,

by the Markov inequality, and Stirling’s approximation to the factorial Vv\Z > 0] < 

( ^ ) right side tends to zero as n grows if 2u; = 2 ■ 

Let 2a; = 2 j for any e > 0. The event “Z = 0” is equivalent to 

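“$\sigma(G) < 2\omega$” because if there is no split independent set of size $2\omega$ then the largest of
such sets can only have fewer than $2\omega$ elements. By the Chebyshev inequality $\Pr[Z = 0] \le
\mathrm{Var}(Z)/E(Z)^2$. Also $\mathrm{Var}(Z) = E(Z^2) - E(Z)^2$. There are $s_{\omega,n} = \binom{n}{\omega}^2$ ways of choosing
$\omega$ vertices from each of two disjoint sets of $n$ vertices. If $Z^i$ is the random indicator set to one
if $S^i$ is a split independent set in $G$ then $Z = \sum_i Z^i$ and $E(Z^2) = \sum_{i,j} \Pr[Z^i \wedge Z^j] =
\sum_{i,j} \Pr[Z^j] \Pr[Z^i \mid Z^j]$ where the sums are over all $i, j \in \{1, \dots, s_{\omega,n}\}$. Finally
by symmetry $\Pr[Z^i \mid Z^j]$ does not actually depend on $j$ but only on the amount of
intersection between $S^i$ and $S^j$. Thus, if $S^1 = \{1, \dots, 2\omega\}$,

$E(Z^2) = \left(\sum_j \Pr[Z^j]\right) \left(\sum_i \Pr[Z^i \mid Z^1]\right) = E(Z) \cdot E(Z \mid Z^1)$.

Thus to prove that $\Pr[Z = 0]$ converges to zero it is enough to show that the ratio
$E(Z \mid Z^1)/E(Z)$ converges to one. By definition of conditional expectation

$E(Z \mid Z^1) = \sum_{i,j} \binom{\omega}{i} \binom{\omega}{j} \binom{n-\omega}{\omega-i} \binom{n-\omega}{\omega-j}\, q^{\omega^2 - ij}$.

Define $T_{ij}$ (generic term in $E(Z \mid Z^1)/E(Z)$) by

$T_{ij} = \binom{n}{\omega}^{-2} \binom{\omega}{i} \binom{\omega}{j} \binom{n-\omega}{\omega-i} \binom{n-\omega}{\omega-j}\, q^{-ij}$.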
Tedious algebraic manipulations prove that Tqq < 1 - , and, for

sufficiently large n, Tij < forz + j = l,andTij < Tio for alii, j 

From these results it follows that 

Pr [a < 2 < Too + T,o + Toi + w^Tio - 1 

< 1 - 2c^yn + + ^ - 1 

— / (n— 02 + 1 )^ n n 

- u^(2uj-l) ^ 

— (n— 02 + 1)2 “r ^ 

□ 

Theorem 8. $\beta(G) > n - \frac{2 \log n}{\log(1/q)}$ for a.e. $G \in \mathcal{G}(K_{n,n}, p)$ with $p$ constant.

The similarities between the properties of independent sets in random graphs and
those of split independent sets in random bipartite graphs have some algorithmic
implications. A simple greedy heuristic almost always produces a solution whose cardinality
can be predicted quite tightly. Let $I$ be the independent set to be output. Consider the
process that visits the vertices of a random bipartite graph $G(V_1, V_2, E)$ in some fixed
order. If $V_i = \{v^i_1, \dots, v^i_n\}$ then the algorithm will look at the pair $(v^1_j, v^2_j)$ during step
$j$. If $\{v^1_j, v^2_j\} \notin E$ and if there is no edge between $v^i_j$ and any of the vertices which are
already in $I$ then $v^1_j$ and $v^2_j$ are inserted in $I$. Let $\sigma_g(G) = |I|$.
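
The greedy process just described can be sketched as follows. The representation is an assumption made for the example: both sides are indexed by `range(n)`, an edge between the `a`-th vertex of $V_1$ and the `b`-th vertex of $V_2$ is the pair `(a, b)`, and the function name is hypothetical.

```python
def greedy_split_independent_set(n, edges):
    """Visit the pairs (v^1_j, v^2_j) in order and keep those that stay independent.

    Returns the chosen indices on each side; the split independent set
    has twice as many vertices as there are chosen indices.
    """
    left, right = [], []
    for j in range(n):
        if (j, j) in edges:
            continue  # the two candidates are adjacent to each other
        if any((j, b) in edges for b in right):
            continue  # v^1_j has a neighbour already in I
        if any((a, j) in edges for a in left):
            continue  # v^2_j has a neighbour already in I
        left.append(j)
        right.append(j)
    return left, right
```

Only edges between the current pair and vertices already in $I$ are examined, so edges among rejected vertices are never looked at.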

Theorem 9. $\sigma_g(G) \sim \frac{\log n}{\log(1/q)}$ for a.e. $G \in \mathcal{G}(K_{n,n}, p)$ with $p$ constant.

Proof. Suppose that $2(k - 1)$ vertices are already in $I$. The algorithm above will add two
vertices $v_1$ and $v_2$ as the $k$th pair if $\{v_1, v_2\} \notin E$ and there is no edge between either $v_1$
or $v_2$ and any of the vertices which are already in $I$. The two events are independent in the
given model and their joint probability is $(1 - p) \cdot (1 - p)^{2(k-1)} = (1 - p)^{2k-1}$. Let $W_k$
(for $k \in \mathbb{N}^+$) be the random variable equal to the number of pairs considered before the
$k$th pair is added to $I$. $W_k$ has geometric distribution with parameter $P_k = (1 - p)^{2k-1}$.
Moreover the variables $W_1, W_2, \dots$ are all independent. Let $Y_\omega = \sum_{k=1}^{\omega} W_k$. The
event “$Y_\omega < n$” is implied by “$\sigma_g(G) > 2\omega$”: if the split independent set returned by
the greedy algorithm contains more than $2\omega$ vertices that means that the algorithm finds
$\omega$ independent pairs in strictly less than $n$ trials. Also if $Y_\omega < n$ then certainly each of
the $W_k$ cannot be larger than $n$. Hence

Pr[r^ <n]< Pr[n+i{ILfc < n}] = nLi{l - [1 - (1 - 

Let Lo = *' 2 ~|og given e > 0 and r G IN, choose m > r/e. For sufficiently 

large n, uj — m > 0. Hence 

Pr[F. <n]< nL.-„{l - [1 - (1 - p^'^+r} 

that is at most {1 — [1 — (1 _p) 2 (w-m)-i-ijn|m_ _ j.'^n > 1 _ we also have 

Pr[yjj < n] < {n(l — pj 2 (w-m)-i-i|m _ xjje event > n” is equivalent to 

“ag{G) < 2uj”. Let uj = ^^Tog i°/g" • If + > n then there must be at least one k for 
which Wk > njuj. Hence Pr[yjj > n < Pr[Uf^i{Wk > n/ijj}] this is at most 

ELi > «+] < w[l - (1 - P?"-^] • 
By the choice of w, (1 - ^ ^ . Hence



Pr[Kj > n]<LO 1 



(l_e)l L"/“J 



i-p 



< oj exp ■ 



n n \ 

1-P LwJ / 



Finally > n] < wexp| — — o(l)| since [n/wj > n/u — 1, and the 

result follows from the choice of uj. □ 

The greedy algorithm analysed in Theorem 9 does not expose any edge between
two vertices that are not selected to be in $I$. Therefore $G - I$ is a random graph.
Classical results ensure the existence of a perfect matching in $G - I$, and polynomial time
algorithms exist which find one such matching. We have proved the following.



Theorem 10. $\beta(G) < n - \frac{\log n}{2 \log(1/q)}$ for a.e. $G \in \mathcal{G}(K_{n,n}, p)$ with $p$ constant.



4 Regular Graphs 



In this section we look at the size of the smallest maximal matchings in random regular
graphs. Again known upper bounds on the independence number of such graphs imply,
in nearly all interesting cases, good lower bounds on $\beta(G)$.

Theorem 11. For each $d \ge 3$ there exists a constant $\gamma(d)$ such that $\beta(G) > \gamma(d) n$ for
a.e. $G \in \mathcal{G}(n, d\text{-reg})$.

Proof. It is convenient to use the configuration model described in the introduction. Two
pairs of balls in a configuration are independent if each ball is chosen from a distinct
urn. A matching in a configuration is a set of independent pairs. The expected number
of maximal matchings of size $k$ in a random configuration is

$\frac{n!}{k!\,(n-2k)!} \cdot \frac{[2k(d-1)]!}{[k(2d-1) - nd/2]!} \cdot \frac{(dn/2)!}{(dn)!} \cdot d^{2k}\, 2^{d(n-2k)}$.

If $k = \gamma n$, using Stirling’s approximation to the factorial, this is at most



fh,d) 



U ' "i 


'd(l-2-if' 




'7(2d-l)-d/2' 






[7(2d-l)-d/2]^<=^'^-i) 


d 



For every $d$ there exists a unique $\gamma_1(d) \in \left(\frac{d}{2(2d-1)}, \frac{1}{2}\right)$ for which $f(\gamma, d) \ge 1$, for
$\gamma \in (\gamma_1(d), 0.5)$. Since the probability that a random configuration corresponds to a
$d$-regular graph is bounded away from zero (see for example [1, Chap 2]), the probability that a random
$d$-regular graph has a maximal matching of size $\gamma n$ is at most $f(\gamma, d)^n$. If $d > 6$ a better
bound is obtained by using $\gamma(d) = (1 - \alpha_3(d))/2$ where $\alpha_3(d)$ is the smallest value in
$(0, 1/2)$ such that $\alpha(G) < \alpha_3(d) n$ for a.a. $G \in \mathcal{G}(n, d\text{-reg})$ [7]. □

The relationship between independent sets and maximal matchings can be further
exploited also in the case where $G \in \mathcal{G}(n, d\text{-reg})$, but random regular graphs are rather
sparse graphs and the approach used in the previous sections cannot be easily applied in
this context. However, a simple greedy algorithm which finds a large independent set in a
$d$-regular graph can be modified and incorporated in a longer procedure that finds a small
maximal matching in a random regular graph. Consider the following algorithm $\mathcal{A}$.

Input: Random $d$-regular graph with $n$ vertices

(1) $M \leftarrow \emptyset$;

(2) while there is a vertex of degree $d$ do

    choose $v$ in $V$ u.a.r. among the vertices of degree $d$;

    $M \leftarrow M \cup \{v, u_1\}$; /* Assume $N(v) = \{u_1, \dots, u_{\deg_G v}\}$ */

(3) for $j = 1$ to $d - 1$ do

    choose $v$ u.a.r. among the vertices of degree $d - j$ in $V(M)$;

v^\ W; 

(4) find a maximal matching M' in what is left of G; 

(5) make M U M' into a maximal matching for G. 



Step (2) essentially mimics one of the algorithms presented in [9], with the only difference
that instead of selecting an independent set of vertices, the process selects a set of edges.
Step (4) can clearly be performed in polynomial time. In general the set $M \cup M'$ is an
edge dominating set (each edge in $G$ is adjacent to some edge in $M \cup M'$) but it is not
necessarily a matching. However [10], any edge dominating set $F$ can be transformed
in polynomial time into a maximal matching $M$ of $G$ with $|M| \le |F|$. Let $D_i = \{v : \deg_G v = i\}$.
In the remaining part of this section we will analyse the evolution of $|D_i|$
for $0 \le i \le d$, as the algorithm goes through steps (2) and (3). Step (3) is performed in a
number of iterations. For $j \ge 0$, let $V_i^j(t)$ be the size of $D_i$ at stage $t$ of iteration $j$, with
the convention that iteration 0 refers to the execution of step (2).
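
To make step (2) concrete, here is a minimal Python sketch of that step alone, under stated assumptions: the graph is given as a dictionary of neighbour sets, `u_1` is taken to be the smallest neighbour in a fixed ordering, and the names `greedy_step2` and `cube_graph` are hypothetical helpers, not taken from the paper. Steps (3) to (5) are deliberately left out.

```python
import random

def greedy_step2(adj, d, rng):
    """Step (2): repeatedly pick a random vertex v of current degree d,
    put the edge {v, u_1} into M and delete all edges at v."""
    g = {v: set(ns) for v, ns in adj.items()}   # working copy of G
    M, chosen = [], []
    while True:
        full = sorted(v for v in g if len(g[v]) == d)
        if not full:
            break
        v = rng.choice(full)
        u1 = min(g[v])            # assumption: u_1 is the first neighbour in a fixed ordering
        M.append((v, u1))
        chosen.append(v)
        for u in g[v]:
            g[u].discard(v)       # every neighbour of v loses one unit of degree
        g[v] = set()              # v now has degree zero
    return M, chosen, g

def cube_graph():
    """The 3-dimensional hypercube, a small 3-regular test graph."""
    return {v: {v ^ 1, v ^ 2, v ^ 4} for v in range(8)}

if __name__ == "__main__":
    G = cube_graph()
    M, chosen, rest = greedy_step2(G, 3, random.Random(0))
    print(len(M), "edges selected in step (2)")
```

The chosen vertices always form an independent set, because every neighbour of a chosen vertex drops below degree d and can never be chosen later; this is the sense in which step (2) mimics a greedy independent set procedure while collecting edges.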

Step (2) for d-regular graphs. Theorem 4 in [9] implies that step (2) proceeds for
asymptotically $x_1 n = \frac{n}{2}\left(1 - \left(\frac{1}{d-1}\right)^{2/(d-2)}\right)$ stages, adding an edge to $M$ at every stage.

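Let $V_i^{j+}(t) = |D_i \cap V(M)|$ at stage $t$ of iteration $j$ and set $V_i^{j-}(t) = V_i^j(t) - V_i^{j+}(t)$.
Let $\Delta V_i^{0\,\mathrm{sign}}(t)$ denote the expected change of $V_i^{0\,\mathrm{sign}}$ (with sign $\in$ {"−", "+", ""})
moving from stage $t$ to $t+1$ of step (2), conditioned on the history of the algorithm's
execution up to stage $t$. Let $v$ be the chosen vertex of degree $d$. We assume a given
fixed ordering among the vertices adjacent to $v$. The edge $\{v, u_1\}$ is added to $M$ and
edges $\{v, u_i\}$ (for $i = 2, \ldots, \deg_G v$) are removed from $G$. Vertex $v$ becomes of degree
zero and the expected reduction in the number of vertices of degree $i$ that are (not) in
$V(M)$ is $\frac{d\,i\,V_i^{0+}(t)}{nd-2dt}$ (resp. $\frac{d\,i\,V_i^{0-}(t)}{nd-2dt}$), that is the expected number of vertices in $D_i \cap V(M)$
(resp. $D_i \cap (V \setminus V(M))$) hit over $d$ trials. The "loss" of a vertex of degree $i$ implies
the "gain" of a vertex of degree $i-1$. Moreover if $u_1 \in D_{i+1} \cap (V \setminus V(M))$ at stage $t$,
then $u_1 \in D_i \cap V(M)$ at stage $t+1$. Let $\delta_{r,s} = 1$ if $r = s$ and zero otherwise. In what
follows $i \in \{1, \ldots, d-1\}$. Also $V_d^{0-}(t) = V_d^0(t)$. We have

$$\Delta V_d^0(t) = -1 - \frac{d^2\,V_d^0(t)}{nd-2dt}$$

$$\Delta V_i^{0+}(t) = -\frac{d\,i\,V_i^{0+}(t)}{nd-2dt} + (1-\delta_{i,d-1})\,\frac{d\,(i+1)\,V_{i+1}^{0+}(t)}{nd-2dt} + \frac{(i+1)\,V_{i+1}^{0-}(t)}{nd-2dt}$$

$$\Delta V_i^{0-}(t) = -\frac{d\,i\,V_i^{0-}(t)}{nd-2dt} + \frac{(i+1)(d-1)\,V_{i+1}^{0-}(t)}{nd-2dt}$$

Setting $x = t/n$, $V_i^{0\,\mathrm{sign}}(t) = n\,v_i^{0\,\mathrm{sign}}(t/n)$, we can consider the following system of
differential equations:

$$\frac{d}{dx}v_d^0(x) = -1 - \frac{d\,v_d^0(x)}{1-2x}, \qquad v_d^0(0) = 1$$

$$\frac{d}{dx}v_i^{0+}(x) = -\frac{i\,v_i^{0+}(x)}{1-2x} + (1-\delta_{i,d-1})\,\frac{(i+1)\,v_{i+1}^{0+}(x)}{1-2x} + \frac{(i+1)\,v_{i+1}^{0-}(x)}{d(1-2x)}, \qquad v_i^{0+}(0) = 0$$

$$\frac{d}{dx}v_i^{0-}(x) = -\frac{i\,v_i^{0-}(x)}{1-2x} + \frac{(i+1)(d-1)\,v_{i+1}^{0-}(x)}{d(1-2x)}, \qquad v_i^{0-}(0) = 0$$

$$\frac{d}{dx}v_0^0(x) = 1 + \frac{v_1^{0+}(x)}{1-2x} + \frac{v_1^{0-}(x)}{1-2x}, \qquad v_0^0(0) = 0$$

In each case $|V_i^{0\,\mathrm{sign}}(t+1) - V_i^{0\,\mathrm{sign}}(t)|$ is bounded by a constant. Also, the system
of differential equations above is sufficiently well-behaved, so that the hypotheses of
Theorem 1 in [9] are fulfilled and thus for large $n$, $V_i^{0\,\mathrm{sign}}(t) \sim n\,v_i^{0\,\mathrm{sign}}(t/n)$ where
$v_i^{0\,\mathrm{sign}}$ are the solutions of the system above.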

Lemma 1. For each d G IN'’" and for each i G {!,..., d}, there is a number Ai, two 
sequences of real numbers ^ d-i+i -^A Bf > and a number Ci 

such that the system of differential equation above admits the following unique solutions: 



(x) = Aj(l - 2x) + (1 - 2x) 2 



r d — i + l 1 

Q log(l - 2x) + X;]=o" Bl ox^ 



+{1-2x)'^Y.\J^^ Bl,x^ 

d—i 

-II 






(A)‘ 



v°(x) =fo(x)-fo(0) 



where fo(x) = X + (^^') / 



Proof. We sketch the proof of the first two results (which can be formally carried out by
induction on $d - i$). For $i = d$, $v_d^0(x) = v_d^{0-}(x) = -\frac{1}{d-2}(1-2x) + \frac{d-1}{d-2}(1-2x)^{d/2}$.
Assuming the result holds for $v_{i+1}^{0-}(x)$ and letting $D_i = (i+1)\left(\frac{d-1}{d}\right)$, we have

$$v_i^{0-}(x) = D_i\,(1-2x)^{i/2}\int_0^x \frac{v_{i+1}^{0-}(s)}{(1-2s)^{i/2+1}}\,ds$$

and the result follows by integration (in particular, the logarithmic terms are present only 
iff < 2). 

Let I°Ai{x) = fo then v°~(x) = A/°+i(a:). Therefore 



V, 



= (.- + 1 ) + ¥ A-i(») 

»??■(■) H- I 

0 /, o„^l+4 d-l 



= (* + 1) /; ^^±^ds 

•^0 (l- 2 s)i +2 
t—i—1 

- 1 



= (t+l) 

= (Z+1) 

\A) 



(A) 

(A)' 



:-2-i 



r 

Jo 



(l-2s)^+i 



-ds • 



d-l 



i—i—1 



- 1 



- 1 

dv^~ (x) 
d-l 



(i+l)(d-l) 



d-l 



d-l 
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The third result follows by replacing the expression for in 



ug'(a;) = 1 + 



■"i 

l-2x 



l-2x 



□ 



Lemma 2. Let $x_1$ be the smallest root of $v_d^0(x) = 0$. After Step (2) is completed the size
of $M$ is asymptotically $x_1 n$ for a.e. $G \in \mathcal{G}(n, d\text{-reg})$. □
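
As a numerical illustration of Lemma 2, the following Python sketch finds the smallest root of $v_d^0$ in two ways. It assumes that $v_d^0$ solves $v' = -1 - d\,v/(1-2x)$ with $v(0) = 1$ (the equation for the vertices of degree $d$), whose explicit solution has its first root at $\frac{1}{2}\left(1 - (d-1)^{-2/(d-2)}\right)$. The function names are hypothetical, and the Runge-Kutta step size is an arbitrary choice.

```python
def x1_closed(d):
    """Smallest root of v(x) = -(1-2x)/(d-2) + (d-1)/(d-2) * (1-2x)**(d/2)."""
    return 0.5 * (1.0 - (d - 1) ** (-2.0 / (d - 2)))

def x1_numeric(d, h=1e-4):
    """Integrate v' = -1 - d*v/(1-2x), v(0) = 1, with classical RK4
    and return the first point where v crosses zero."""
    def f(x, v):
        return -1.0 - d * v / (1.0 - 2.0 * x)
    x, v = 0.0, 1.0
    while True:
        k1 = f(x, v)
        k2 = f(x + h / 2, v + h * k1 / 2)
        k3 = f(x + h / 2, v + h * k2 / 2)
        k4 = f(x + h, v + h * k3)
        v_next = v + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        if v_next <= 0.0:
            # linear interpolation inside the last step
            return x + h * v / (v - v_next)
        x, v = x + h, v_next

if __name__ == "__main__":
    for d in (3, 4, 5):
        print(d, x1_closed(d), x1_numeric(d))
```

For cubic graphs both computations give $x_1 = 3/8$, so under this assumption step (2) contributes about $0.375n$ edges to $M$.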



Step (3.j) for cubic graphs. During this step the algorithm chooses a random vertex in
$D_{3-j} \cap V(M)$ and removes it from $G$ (all edges incident to it will not be added to $M$).
Let $c_j (3-j) n/2$ be the number of edges left at the beginning of iteration $j$. If iteration
$j-1$ ended at stage $x_j n$, the parameter $c_j$ satisfies the recurrence:

$$c_j = \frac{c_{j-1}(4-j)}{3-j} - \frac{2(4-j)\,x_j}{3-j} = \left(1 + \frac{1}{3-j}\right)(c_{j-1} - 2x_j)$$

(with $c_0 = 1$) where $x_1$ has been defined above and $x_2$ and $x_3$ will be defined later. For
alH G {1, 2, 3} the expected decrease in the number of vertices of degree i in V{M) 

(resp. not in V{M)) is ( e^w- 2 t following set of equations describes the 

expected change in the various Vl In what follows i G {1, . . . , 3 — j}. Notice 

that = 0 for alH > 3 — j during iteration j so there are only 3 — j equations 

involving but there are always two involving Vf~ (t). 



vr(t) 



AVil W - 1 + Cjn-2t 

AVtit) = -53-y^- ^ 

Avrit) = + (1 - S 2 f) 

^ ^ ’ Cin—2t ^ Cin—2t 



Cjn—2t 
CAn—2t ' 



b + i ) Lyi ( t ) 

Cin— 2 t 



Leading to the d.e.’s 



w = i+£S + 









Vn (X) - i + G _2a; + 

{i+l)vlfiG) 

Cj—2x 

^vp(x) I ^ ^ 



,,i+ 



(-) = -<^3-,. - ^ + 



c^(0) = 0 

■■^'+(0) = v^-^^{xj) 






VI = Vf 

^ r ( o ) = 



Theorem 12. Let Xj be the smallest positive root ofv')^_^'^^{x) = O.forj G {1, 2, 3}. 
For a.e. G G Q{n, 3-reg) algorithm A returns a maximal matching of size at most 



Pu{G) ^n(xi + 



Proof. The result follows again by applying Theorem 1 in [9] to the random variables 
vl Notice that all functions v\ (x) have a simple expression which can be 

derived by direct integration and, in particular, 



X2 = f 



1 — exp 






X3 



£2 

2 






□

5 Conclusions

In this paper we presented a number of results about the minimal size of a maximal
matching in several types of random graphs. If the graph $G$ is dense, with high probability
$\beta(G)$ is concentrated around $|V(G)|/2$ (both in the general and bipartite case).
Moreover simple algorithms return an asymptotically optimal matching. We also gave
simple combinatorial lower bounds on $\beta(G)$ if $G \in \mathcal{G}(n, c/n)$. Finally we presented
combinatorial bounds on $\beta(G)$ if $G \in \mathcal{G}(n, d\text{-reg})$ and an algorithm that finds a maximal
matching of size asymptotically less than $|V(G)|/2$ in $G$. The complete analysis was
presented for the case when $G \in \mathcal{G}(n, 3\text{-reg})$. In such case the bound in Theorem 11 and
the algorithmic result in Theorem 12 imply that $0.3158n < \beta(G) < 0.47563n$. Results
similar to Theorem 12 can be proved for random $d$-regular graphs, although some extra
care is needed to keep track of the evolving degree sequence. Our algorithmic results
exploit a relationship between independent sets and maximal matchings. In all cases
the given minimisation problem is reduced to a maximisation one, and the analysis is
completed by exploiting a number of techniques available to deal with the maximisation
problem. The weakness of our results for sparse graphs and for regular graphs leaves
open the problem of finding a more direct approach which might produce better results.

Abstract. Given a set $\xi = \{H_1, H_2, \ldots\}$ of connected non-acyclic graphs, a
$\xi$-free graph is one which does not contain any member of $\xi$ as induced subgraph.
Our first purpose in this paper is to perform an investigation into the limiting
distribution of labeled graphs and multigraphs (graphs with possible self-loops and
multiple edges), with $n$ vertices and approximately $\frac{1}{2}n$ edges, in which all sparse
connected components are $\xi$-free. Next, we prove that for any finite collection $\xi$ of
multicyclic graphs almost all connected graphs with $n$ vertices and $n + o(n^{1/3})$
edges are $\xi$-free. The same result holds for multigraphs.



1 Introduction 

We consider here labeled graphs, i.e., graphs with labeled vertices, undirected edges and
without self-loops or multiple edges, as well as labeled multigraphs, which are labeled
graphs with self-loops and/or multiple edges. An $(n, q)$ graph (resp. multigraph) is one
having $n$ vertices and $q$ edges.

On one hand, classical papers, e.g. [7], [8], [11] and [13], provide algorithms and
analysis of algorithms that deal with the generation of random graphs or multigraphs,
estimating relevant characteristics of their evolution. Starting with an initially empty graph of $n$
vertices, we enrich it by successively adding edges. As a random graph evolves, it displays
a phase transition similar to the typical phenomena observed with percolation processes.
On the other hand, various authors such as Wright [19], [21] or Bender, Canfield and
McKay [3], [4] studied exact enumeration or asymptotic properties of labeled connected
graphs.

In recent years, a lot of research was performed for graphs without certain graphs
as induced subgraphs. Let $H$ be a connected graph and let $\mathcal{F}$ be a family of graphs
none of which contains a subgraph isomorphic to $H$. In this case, we say that the family
$\mathcal{F}$ is $H$-free. The most commonly forbidden subgraphs are triangles, ..., $C_n$, $K_n$, $K_{p,q}$ graphs or any
combination of them. We refer to as bicyclic graphs all connected graphs with $n$ vertices
and $(n+1)$ edges and in general $(q+1)$-cyclic graphs are connected $(n, n+q)$ graphs.
Also in this case, we say that it is a $q$-excess graph. In general, we refer to as multicyclic
a connected graph which is not acyclic. The same nomenclature holds for multigraphs.
Denote by $\xi = \{H_1, H_2, H_3, \ldots\}$ a set of connected multicyclic graphs. A $\xi$-free graph
is one which does not contain any member $H_i$ of $\xi$ as induced subgraph. Throughout this
paper, each $H_i$ is a connected multicyclic graph. Our goal in this paper is to extend the
study of random $(n, m(n))$ graphs to random $\xi$-free $(n, m(n))$ graphs when the number
of edges, added one at a time and at random, reaches $m(n) \approx \frac{1}{2}n$ and to compute the
asymptotic number of $\xi$-free connected graphs when $\xi$ is finite. To do this, we will rely
strongly on the result of [21]; in particular, we will investigate $\xi$-free connected $(n, n+k)$
graphs when $k = o(n^{1/3})$. Note that similar work can be done with multigraphs.

This paper is organized as follows. In Section 2, we recall some useful definitions
that we will encounter throughout the rest of this document. In Section 3, we
will work with the example of the enumeration of bicyclic graphs. The enumeration
of these graphs was discovered, as far as we know, independently by Bagaev [1] and
by Wright [19]. The purpose of this example is two-fold. First, it brings a simple new
combinatorial point of view to the relationship between the generating functions of some
integer partitions, on one hand, and graphs or multigraphs, on the other hand. Next, this
example gives us ideas, regarding the simplest complex components, of what will happen
if we force our graphs to contain some specific configurations (especially the form of
the generating function). Section 4 is devoted to the computation of the probability of
random graphs without isomorphs in the general case. In Section 5, we give an asymptotic
formula for the number of connected graphs with $n$ vertices and $n+k$ edges as $n \to \infty$ and
$k \to \infty$ but $k = o(n^{1/3})$, and prove that almost all $(n, n + o(n^{1/3}))$ connected graphs are
$\xi$-free when $\xi$ is finite.



2 Definitions 

Powerful tools in all combinatorial approaches, generating functions will be used for our
purposes. If $F(z)$ is a power series, we write $[z^n]\,F(z)$ for the coefficient of $z^n$ in $F(z)$.
We say that $F(z)$ is the exponential generating function (EGF for short) for a collection
$\mathcal{F}$ of labeled objects if $n!\,[z^n]\,F(z)$ is the number of ways to attach objects in $\mathcal{F}$ that
have $n$ elements (see for instance [18] or [12]). The bivariate EGF for labeled rooted
trees satisfies

$$T(w,z) = z\exp(w\,T(w,z)) = \sum_{n>0} (wn)^{n-1}\frac{z^n}{n!}, \qquad (1)$$

where the variable $w$ is the variable for edges and $z$ is the variable for vertices. Without
ambiguity, one can also associate a given configuration of labeled graph or multigraph
with its EGF. For instance, a triangle can be labeled in only one way. Thus,

$$C_3 \to C_3(w, z) = \frac{1}{6}w^3z^3. \qquad (2)$$
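
A quick way to get a feel for the tree EGF is to check the functional equation coefficient by coefficient. The Python sketch below does this in the univariate case $w = 1$, where $T(z) = z\exp(T(z))$ and $[z^n]\,T(z) = n^{n-1}/n!$; the helper names are illustrative assumptions, and exact rational arithmetic is used so that the comparison is not affected by rounding.

```python
from fractions import Fraction
from math import factorial

def cayley_coeffs(N):
    """Coefficients of T(z) = sum_{n>=1} n^(n-1) z^n / n!, up to z^N."""
    return [Fraction(0)] + [Fraction(n ** (n - 1), factorial(n)) for n in range(1, N + 1)]

def series_exp(a, N):
    """exp of a power series a with a[0] == 0, truncated at z^N.
    Uses n*e_n = sum_{k=1..n} k*a_k*e_{n-k}, which follows from e' = a' e."""
    e = [Fraction(0)] * (N + 1)
    e[0] = Fraction(1)
    for n in range(1, N + 1):
        e[n] = sum(k * a[k] * e[n - k] for k in range(1, n + 1)) / n
    return e

def check_functional_equation(N):
    """True if T(z) and z*exp(T(z)) agree up to z^N."""
    T = cayley_coeffs(N)
    rhs = [Fraction(0)] + series_exp(T, N)[:N]   # multiply exp(T) by z
    return T == rhs

if __name__ == "__main__":
    print(check_functional_equation(10))
```

The same check extends to the bivariate form by replacing $z$ with $wz$ and dividing by $w$, since $T(w,z) = T(wz)/w$.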

We will denote by $W_k$, resp. $\widehat{W}_k$, the EGF for labeled multicyclic connected multigraphs,
resp. graphs, with $k$ edges more than vertices. These EGF have been computed in [19]
and in [13]. Furthermore, we will denote by $W_{k,H}$ and $\widehat{W}_{k,H}$ the EGF of multicyclic
$H$-free multigraphs and graphs with $k$ edges more than vertices. In these notations, the
second index corresponds to the forbidden configuration(s). Recall that a smooth graph
or multigraph is one with all vertices of degree $\ge 2$ (see [20]). Throughout the rest of this







V. Ravelomanana and L. Thimonier 



paper, the "widehat" notation will be used for EGF of graphs and the "underline" notation
corresponds to the smoothness of the species. For example, $\underline{\widehat{W}}_k$, resp. $\underline{W}_k$, are EGF for
respectively connected $(n, n+k)$ smooth graphs and smooth multigraphs.



3 The Link between the EGF of Bicyclic Graphs and Integer 
Partitions 



After the different proofs for trees (see [14] and [9]), Rényi [16] found the formula to
enumerate unicyclic graphs, which can be expressed in terms of the generating function
of rooted labeled trees:

$$\widehat{V}(z) = \frac{1}{2}\ln\frac{1}{1 - T(z)} - \frac{T(z)}{2} - \frac{T(z)^2}{4}. \qquad (3)$$

It may be noted that in some connected graphs, as well as multigraphs, the number of
edges exceeding the number of vertices can be seen as a useful enumerating parameter.
The term bicyclic graphs appeared first in the seminal paper of Flajolet et al. [11],
followed a few years later by the huge one of Janson et al. [13], and refers to
all connected graphs with $(n+1)$ edges and $n$ vertices. Wright [19] found a recurrence
formula, well adapted for formal calculation, to compute the number of all connected
graphs with $k$ edges more than vertices for general $k$. Our aim in
this section is to show that the problem of the enumeration of bicyclic graphs can also
be solved with techniques involving integer partitions.

There exist two types of graphs which are connected and have (n + 1) edges as 
shown by the figures below. 





Fig. 1. Examples of bicyclic components

Fig. 2. Smooth bicyclic components without symmetry

Wright [19] showed with his reduction method that the EGF of all multicyclic graphs,
in particular bicyclic graphs, can be expressed in terms of the EGF of labeled rooted trees.
In order to count the number of ways to label a graph, we can repeatedly prune it by
suppressing recursively any vertex of degree 1. We then remove as many vertices as
edges. As these structures present many symmetries, our experience suggests so far that
we ought to look at our previously described object without symmetry and without the
possible rooted subtrees.
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There are 




n — p 

q 



(p-1)! (g-l)! 
2^2 



q{n-p-q)\ 



^ ways to label the graph repre- 



sented by the figure 2a (p ^ q) and 2( ^ r ^ ways for the graph of the 

figure 2b. Note that the results are independent of the size of the subcycles. One can
obtain all smooth bicyclic graphs after considering possible symmetry criteria. In 2a,
if the subcycles have the same length, $p = q$, a factor 1/2 must be considered and we
have $n!/8$ ways to label the graph. Similarly, the graph of 2b can have the 3 arcs with the
same number of vertices. In this case, a factor 1/6 is introduced. If only two arcs have
the same number of vertices, we need a symmetry factor 1/2. Thus, the enumeration
of smooth bicyclic graphs can be viewed as a specific problem of integer partitioning into
2 or 3 parts following the dictates of the basic graphs of Figure 3.



Fig. 3. The different basic smooth bicyclic graphs (panels (a) to (g))



With the same notations as in [6], denote by $P_i(t)$, respectively $Q_i(t)$, the generating
functions of the number of partitions of an integer in $i$ parts, respectively in $i$ different
parts. Let $\underline{\widehat{W}}_1(z)$ be the univariate EGF for smooth bicyclic graphs, then we have
$\underline{\widehat{W}}_1(t) = f(P_2(t), P_3(t), Q_2(t), Q_3(t))$. A bit of algebra leads to

$$\underline{\widehat{W}}_1(z) = \frac{z^4\,(6 - z)}{24\,(1 - z)^3}. \qquad (4)$$

In this formula, the denominator $(1-z)^3$ denotes the fact that there are at most 3 arcs or 3
degrees of freedom of integer partitions of the vertices in a bicyclic graph. The same remark
holds for the denominators $(1 - T(z))^{3k}$ in Wright's formulae [19], for all $(k+1)$-cyclic
connected labeled graphs. The EGF of labeled rooted trees, $T(z)$, is introduced here
when re-expanding the reduced vertices of some smooth graph. The main consequence
of the relation between integer partitions and these EGF is that in any bicyclic graphs
containing an induced q-gon as subgraph, the EGF is of the form 
The form of these EGF is important for the study of the asymptotic behaviour of random
graphs or multigraphs. The key point of the study of their characteristics is the analytical
properties of the tree polynomials $t_n(y)$ defined as follows:

$$\frac{1}{(1 - T(z))^y} = \sum_{n \ge 0} t_n(y)\,\frac{z^n}{n!}, \qquad (5)$$

where $t_n(y)$ is a polynomial of degree $n$ in $y$. Knuth and Pittel [10] studied their
properties. For fixed $y$ and $n \to \infty$, we have

$$t_n(y) \sim \frac{\sqrt{2\pi}\; n^{\,n - 1/2 + y/2}}{2^{y/2}\,\Gamma(y/2)}. \qquad (6)$$

This equation tells us that in the EGF of bicyclic graphs

$$\widehat{W}_1(z) = \frac{T(z)^4\,(6 - T(z))}{24\,(1 - T(z))^3} = \frac{5}{24}\,\frac{1}{(1 - T(z))^3} - \frac{19}{24}\,\frac{1}{(1 - T(z))^2} + \cdots \qquad (7)$$

only the coefficient $\frac{5}{24}$ of $t_n(3)$ is asymptotically significant. Thus in [13, Theorem
5], the authors proved that only the leading coefficients of $t_n(3k)$ are used to compute
the probability of random graphs or multigraphs. As already said, these coefficients
change only slightly in the study of random graphs or multigraphs without forbidden
configurations. Denote respectively by $V_{C_3}$ and $\widehat{V}_{C_3}$ the EGF for unicyclic multigraphs
and graphs without triangle ($C_3$); we have

$$V_{C_3}(z) = \frac{1}{2}\ln\frac{1}{1 - T(z)} - \frac{T(z)^3}{6} \qquad (8)$$

and

$$\widehat{V}_{C_3}(z) = \frac{1}{2}\ln\frac{1}{1 - T(z)} - \frac{T(z)}{2} - \frac{T(z)^2}{4} - \frac{T(z)^3}{6}. \qquad (9)$$

For bicyclic components without triangle, we have respectively for multigraphs and
graphs






T{z) (3 + 2r(z)) 
24 (1-T(z))3 



and Wi^Csiz) 



T{zf (2-f 6T(z) -3 T(z) 2) 
^4 (1-T(z))3 



( 10 ) 

The decompositions of formulae (10), using the tree polynomials described by (5), lead 
respectively to 



K Y 1 2" 

Wi^Caiz) = ^ 

n>0 

1^1, C3(^) = X)n>0 ~ ~ if ~ (-J2) 

+ fftn(-2) - |f„(-3) -f |in(-4))|j • 

Lemma 1. If $\xi = \{C_k, k \in \Omega\}$ where $\Omega$ is a finite set of integers greater than or equal
to 3, the probability that a random graph or multigraph with $n$ vertices and $\frac{1}{2}n$
edges has only acyclic, unicyclic, bicyclic components, all $C_k$-free, $k \in \Omega$, is

$$\sqrt{\frac{2}{3}}\,\cosh\!\left(\sqrt{\frac{5}{18}}\right) e^{-\sum_{k \in \Omega}\frac{1}{2k}} + O(n^{-1/3}). \qquad (13)$$

□

Proof. This is a corollary of [13, eq (11.7)] using the formulae (8), (9), (11) and (12).
Incidentally, random graphs and multigraphs have the same asymptotic behavior as
shown by the proof of [13, Theorem 4]. As graphs are multigraphs without cycles of
length 1 and 2, the forbidden cycles of length 1 and 2 bring a factor which
is cancelled by a factor arising from the ratio between the weighting functions that
convert the EGF of graphs and multigraphs into probabilities.

The situation changes radically when cycles of length greater than or equal to 3 are
forbidden. Equations (8), (9) and the "significant coefficient" $\frac{5}{24}$ of $t_n(3)$ in (11) and in
(12) and the demonstration of [13, Lemma 3] show us that the term $-\frac{T(z)^k}{2k}$ introduced
in (8) and (9) for each forbidden $k$-gon simply changes the result by a factor of $e^{-1/(2k)} + O(n^{-1/3})$.

The example of forbidden $k$-gon suggests itself for a generalization.
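
The role of the tree polynomials can be explored with exact series arithmetic. The Python sketch below is an illustration under stated assumptions: it takes the bicyclic EGF in the form $T(z)^4(6 - T(z))/(24(1 - T(z))^3)$ and the definition $1/(1-T(z))^y = \sum_n t_n(y) z^n/n!$, restricts $y$ to non-negative integers, and uses hypothetical helper names. It recovers the first numbers of connected labeled graphs with $n$ vertices and $n+1$ edges, and the classical identity $t_n(1) = n^n$.

```python
from fractions import Fraction
from math import factorial

def tree_series(N):
    """Coefficients of T(z) = sum_{n>=1} n^(n-1) z^n / n!, up to z^N."""
    return [Fraction(0)] + [Fraction(n ** (n - 1), factorial(n)) for n in range(1, N + 1)]

def mul(a, b, N):
    """Product of two truncated power series (lists of length N+1)."""
    c = [Fraction(0)] * (N + 1)
    for i in range(N + 1):
        for j in range(N + 1 - i):
            c[i + j] += a[i] * b[j]
    return c

def power(a, k, N):
    """k-th power of a truncated power series, k a non-negative integer."""
    r = [Fraction(1)] + [Fraction(0)] * N
    for _ in range(k):
        r = mul(r, a, N)
    return r

def inverse(a, N):
    """Multiplicative inverse of a truncated power series with a[0] != 0."""
    b = [Fraction(0)] * (N + 1)
    b[0] = 1 / a[0]
    for n in range(1, N + 1):
        b[n] = -sum(a[k] * b[n - k] for k in range(1, n + 1)) / a[0]
    return b

def tree_polynomial_values(y, N):
    """t_n(y) = n! [z^n] (1 - T(z))^(-y) for n = 0..N (integer y >= 0)."""
    T = tree_series(N)
    inv = inverse([Fraction(1)] + [-t for t in T[1:]], N)
    s = power(inv, y, N)
    return [s[n] * factorial(n) for n in range(N + 1)]

def bicyclic_counts(N):
    """n! [z^n] T^4 (6 - T) / (24 (1 - T)^3) for n = 0..N."""
    T = tree_series(N)
    inv = inverse([Fraction(1)] + [-t for t in T[1:]], N)
    numerator = mul(power(T, 4, N), [Fraction(6)] + [-t for t in T[1:]], N)
    W1 = mul(numerator, power(inv, 3, N), N)
    return [W1[n] * factorial(n) / 24 for n in range(N + 1)]

if __name__ == "__main__":
    print([int(c) for c in bicyclic_counts(6)])
    print([int(t) for t in tree_polynomial_values(3, 6)])
```

For $n = 4$ the count is 6, which matches the six ways of deleting one edge from $K_4$; comparing `tree_polynomial_values(3, N)` with the bicyclic counts for growing $N$ shows the $t_n(3)$ term taking over.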



4 Random Isomorphism-Free Graphs 



The probabilistic results on random $H$-free graphs/multigraphs can be obtained by
looking at the form of the decompositions of their EGF into tree polynomials.

Lemma 2. Let H be a connected {n,n + p) graph or multigraph.Let Rq u(w,z), resp. 
Rq,H, be the bivariate EGF of all connected q-excess graphs, resp. multigraphs, con- 
taining at least one subgraph isomorphic to H. Then, z) is of the form 



Rq^H{w,z) = ' 



, nnwz)) 

(1 — T{wz)Y 



where k < 3q and P is a polynomial. Similar formula holds for multigraphs. 



(15) 

□ 



Proof. The more cycles $H$ has, the more the degree of the denominator of the EGF of
multicyclic graphs or multigraphs containing a subgraph isomorphic to $H$ diminishes.
This follows from the fact that the EGF of $(q+1)$-cyclic graphs or multigraphs are simply
combinations of integer partition functions up to $3q$. If we force our structures to contain
some specific multicyclic subgraphs, some parts are fixed and we strictly diminish the
number of parts of integers needed to reconstruct our graphs or multigraphs.



Lemma 3. Let H be a connected multicyclic graph or multigraph with n vertices and 
(n + p) edges withp > 0 and let Wq^H, respectively Wq^n, be the generating functions 
of connected multicyclic H -free multigraphs, respectively graphs, with q edges, (q> p), 
more than vertices. lfWq{z) = + Si>i Wright’s EGF 

of q-excess multigraphs rewritten with tree polynomials then Wq^H{z) = 
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Si>i these formulae the leading coefficient Cq is the same for Wq and 

Wq, defined in [13, equation (8.6)]. Analogous result holds for multigraphs with the 
EGF Wq and Wq^n- □ 

Proof. One can write Wq{z) = where Pq is a polynomial. Then, we can 

express Pq{x) in term of successive powers of (1 — x)* and Cq equals simply Pq{l). 
We have Wq^H{w,z) = + w’^J2i<3qditn{i)"‘^ because Wq{w,z) = 

Wq^H{w, z) + Rq^niw, z), where Rq^niw, z) is the bivariate EGF of multicyclic con- 
nected graphs with q edges more than vertices. As shown by (15), the denominator of 
Rq.niw, z) is strictly less than 3q. 

We are now ready to state the following result. 

Theorem 1. Let ^ = {Hi, H2, H3, ...Hm} be a finite collection of multicyclic con- 
nected graphs or multigraphs. Then the probability that a random graph with n vertices 
and ^n+0{n~3) edges hasri bicy die components, tricyclic components , .... (/c-l-1)- 
cyclic components, all components {Hi, H2, H3, ...Hm}-free and no components of 
higher cyclic order is 





rp- 



rfe! (2r)! 



+ 0{n~^/^) 



where [2 = {p > 3,3i € [1, m] such that Hi is a p-gon}. 



(16) 

□ 



The theorem 2 below shows that a necessary and sufficient condition to change a coeffi- 
cient Ci of (16) is that ^ must contain all graphs contractible to a certain i-excess graph 
H,. 



Theorem 2. Let H be a k-excess multicyclic graph (resp. multigraph) with k > 0. 
Suppose that H has n vertices, n -\- k edges and c{H) n\ is the number of ways to 
label H (for example c(iT 4 ) = 1/24 ). Denote by ^k(H) the set of all k-excess graphs 
contractible to H. Then the probability that a random graph (resp. multigraph) with 
n vertices and m(n) = l/2n -f 0{n~^^^) edges has ri bicyclic, T2 tricyclic, ..., Vp 
p -\- 1-cyclic components, all without component isomorphic to any member of the set 
^k{H) is 



(^Y (cfc - 4+1" 

^ 3 ' V 3 ri! ra! rfc_i! r^! rfc+i! 



r! 



rp\ (2r)! 






□ 



Proof. The EGF of ^k{H) is simply 

= (18) 

Thus in (16) if we want to avoid all graphs contractible to H, we have to substract (18) 
to the EGF of connected fc-excess graphs. Lemma 3 shows us that the other coefficients, 
i.e., $c_i$ for all $i > k$, remain unchanged.

5 Asymptotic Numbers

Denote by $c(n, n+k)$ the number of connected $(n, n+k)$ graphs. Similarly, let $c_\xi(n, n+k)$
be the number of connected $(n, n+k)$ $\xi$-free graphs. Wright [21, Theorem 4] states
the following result:

Theorem 3. If $k = o(n^{1/3})$, but $k \to \infty$ as $n \to \infty$ then

$$c(n, n+k) = d\,(3\pi)^{1/2}\,(e/12k)^{k/2}\, n^{\,n + \frac{1}{2}(3k-1)}\left(1 + O(k^{-1}) + O(k^{3/2}/n^{1/2})\right). \qquad (19)$$

□

Note that later Voblyi [17] proved $d_k \to \frac{1}{2\pi}$ as $k \to \infty$. We prove here that a very
similar result holds for $c_\xi(n, n+k)$, i.e., for $\xi$-free connected graphs when $\xi$ is finite
and $k = o(n^{1/3})$.

If $X(z)$ and $Y(z)$ are 2 EGF, we note here that $X \ge Y$ iff $\forall n$, $[z^n]\,X(z) \ge [z^n]\,Y(z)$.
Denote by $\mathcal{W}_k$, $k \ge 0$, the set of $k$-excess graphs and $W_k(w, z)$ their bivariate exponential
generating function. Thus, $W_k(w, z)$, $k \ge 1$, has the following form



Wk{w,z) = 



(1 — T{wz)Y^ (1 — T{wz)Y^ 1 



and Wo(ru, z) = V {w, z) as in eq. (3). 

Furthermore, denote by the set of connected ^-excess ^-free graphs. 
Lemma 4. If^ is finite, Wk,^(w, z),k > 0 has the following form 



{ck + 0 
(1 — T{wz 



CTfc) 

2))3fc-l 



where ak = 0if^ does not contain a p-gon. □ 

Proof. Denote respectively by Sk,^ and j the FGF for k-excess graphs containing 
exactly one occurrence of a member of ^ and k-excess with many occurrences of member 
of ^ but necessarily juxtaposed, i.e. the deletion of one edge will delete any occurrences 
of any member of For example if G 3 and C 4 are in a "‘house” is a juxtaposition of 
them. Then, remember that Wk,^ satisfy the recurrence (see also [15]) 

VwWk+i,^ + 0(S'fe-i-i,{) + 0{Jk+i,f) = 2 ~ 

( 22 ) 

Z^p+q=k t+Sp,g 

where Vj, = x-^ (see [12]). Lemma 4 follows from the fact that 5'^ j and ^ are of 
the form described by lemma 2. We have, for A: > 0 the formula for smooth (n, n -F A;) 
graphs with exactly one triangle 

(vjjffaifi + F ^ (23) 

^ ^ i<3k-2 ^ ^ 
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Thus, 



Sk,C:^{z) ^ {k 1) ^ E_^ (1 _ t\z)Y 



(24) 



and in this case a^. of (21) equals |(fc — l)bk-i- 



Lemma 5. 

bk (cfc + cxk) N bk 

(1 -T(z))3fc “ (1 - T(z))3'=-1 - - (1-T(z))3fc 



(25) 

□ 



Proof. We have to prove only + {i-T(z)pk-i > 0> since Wright 

[21] show Wk{z) < and a fortiori, we have Vhfc,^( 2 :) < (i_rfz)) 3 fc ■ Substi- 

tuting (21) in (22) leads to 



fc-i 

2(A: + l)6fc+i = 3k{k + l)bk + 3 ^ i(A: - t)6t6fc_f (26) 



and 



2(3fc + 2)(cfe+i + afc+i) — 8(fc + l)6fc+i + 3kbk + (3/c + 2)(3fc — l)(cfe + ak) 



+ t(3k — 2>t — l)bt{ck-t + ctk-t) 

(27) 

Then we have also as in [21, Lemma 5], kbk < Ck + ak < ° kbk- Still using 
similar arguments to those of [21], the equivalent of [21, Lemmas 6, 7, 8, 9, 10] can 
also be obtained here (with the coefficients Ck + ak instead of Ck) to prove lemma 5 by 
induction on k. 



Theorem 4. Given a finite collection $\xi$ of multicyclic graphs, almost all connected
$(n, n+k)$ graphs are $\xi$-free when $k = o(n^{1/3})$. □

Proof. By Lemma 5, [21, eq (5.2), (5.3), Theorem 2] and (26).

Abstract. Random regular graphs are, at least theoretically, popular
communication networks. The reason for this is that they combine low
(that is constant) degree with good expansion properties crucial for efficient
communication and load balancing. When any kind of communication
network gets large one is faced with the question of fault tolerance
of this network. Here we consider the question: Are the expansion properties
of random regular graphs preserved when each edge gets faulty
independently with a given fault probability? We improve previous results
on this problem: Expansion properties are shown to be preserved
for much higher fault probabilities and lower degrees than was known
before. Our proofs are much simpler than related proofs in this area.



Introduction 

A natural question in the theory of fault tolerance of communication networks 
reads: Is it possible to simulate the non-faulty network on the faulty one with a 
well determined slowdown? Here one assumes that the network proceeds in syn- 
chronous steps and in each step each processor (= node of the network) performs 
some local computation and some communication steps. Ideally one would like to 
simulate the non-faulty network in such a way that the simulation is slower only 
by a constant factor showing that the time is essentially unchanged. Whereas 
such efficient simulations are known for networks with unbounded degree, like 
the hypercube, it is still an important question whether they exist for bounded 
degree networks like the butterfly [3] . Note that all of this paper refers to random 
faults, that is each component (normally edge or node) gets faulty independently 
with a given fault probability and the results only hold with high probability 
meaning with probability going to 1 when the network gets large. 

Random regular graphs with given degree $d \ge 3$ are well known to be expander
graphs (with high probability) [2]: There is a constant $C$ ($< 1$) such
that each subset $X$ of nodes has $\ge C \cdot |X|$ neighbours adjacent to $X$ but not
belonging to $X$ (provided $X$ contains at most half of all vertices). If we ever were
to simulate computation on a random regular graph with slowdown only by a
constant factor on the faulty graph we would need a linear size expander inside
the faulty graph.

obtained through Alasdair Urquhart.

The investigation of random regular graphs with edge faults starts with the
paper [9]. In the succeeding paper [10] attention is drawn to the preservation
of expansion properties. Some sufficient conditions are given. In work by the
first author [4] a threshold result for the existence of a linear size component is
proved. In [5] we give a sufficient condition on fault probability and degree such
that we can find a linear size expander efficiently, a question not treated in the
initial work on expansion [10]. Crucial to our result is the notion of a $k$-core:
The $k$-core of a given graph is the (unique) maximal subgraph where each node
has degree at least $k$. In [5] we first observe that the 3-core of a faulty random
regular graph is an expander (this follows simply from randomness properties
of the 3-core). Second, we present a simple edge deletion algorithm which is
shown to find a 3-core of linear size when $d > 42$ and each edge is non-faulty
with probability at least $20/d$.

The present paper improves considerably on these results: We give a precise
threshold on the fault probability for the existence of a linear size $k$-core for
any $d > k \ge 3$, thus improving the previous bounds for the existence of
an expanding subgraph. For example when the degree is as low as 4 and each
edge is faulty with probability $< 1/9$ we have a linear size 3-core and thus an
expanding subgraph.

Our proof uses a proof technique originally developed for [7]. It is technically
quite simple. This is in sharp contrast to the previous proofs of the weaker
results mentioned above, which rely on technically advanced probability theoretic
tools. This technique applies to a wide range of similar problems (see [8]). The
technique was inspired by the original (more involved) proof of the $k$-core threshold
for $G_{n,p}$ given in [6].



1 Outline 

We will study random regular graphs with edge faults by focussing on the configuration
model (cf. [1]). It is well known that properties which hold a.s. (almost
surely) for a uniformly random $d$-regular configuration also hold a.s. for a uniformly
random $d$-regular simple graph. For the configuration model, we consider
$n$ disjoint $d$-element sets called classes; the elements of these classes are called
copies. A configuration is a partition of the set of all copies into 2-element
sets, which are edges. Identifying classes with vertices, configurations determine
multigraphs and standard graph theoretic terminology can be applied to configurations.
More details can be found in [1]. We fix the degree $d$ and the probability
$p$ for the rest of this paper and consider probability spaces Con(n, d, p) of random
configurations where each edge is present with probability $p$ or absent with
fault probability $f = 1 - p$. We call this space the space of faulty configurations.
An element of this space is best considered as being generated by the following
probabilistic experiment consisting of two stages:

(1) Draw randomly a configuration $\Phi = (W, E)$ where $W = W_1 \cup \ldots \cup W_n$
and $|W_i| = d$. (2) Delete each edge of $\Phi$ along with its end-copies, independently
with fault probability $f$.

The probability of a fixed faulty configuration with k edges is proportional to 
(n · d - 2 · k)!! · (1 - p)^{n·d/2 - k} · p^k. Given k, each set of k edges is equally likely to occur. 

The degree of a class W with respect to a faulty configuration Φ, Deg_Φ(W), is 
the number of copies of W which were not deleted. Note that edges {x, y} with 
x, y ∈ W contribute with two to the degree. The degree of a copy x, Deg_Φ(x), is 
the degree of the class to which x belongs. The k-core of a faulty configuration 
is the maximal subconfiguration of the faulty configuration in which each class 
has a degree ≥ k. We call classes of degree less than k light whereas classes of 
degree at least k are heavy. By Bin(m, λ) we denote the binomial distribution 
with parameters m and success probability λ. 

We now give an overview of the proof of the following theorem which is the 
main result of this paper. For d > k ≥ 3 we consider the real valued function 
L(λ) = λ / Pr[Bin(d-1, λ) ≥ k-1] which we define for 0 < λ ≤ 1. L(1) = 1 and 
L(λ) goes to infinity for λ approaching 0. Moreover L(λ) has a unique minimum 
for 1 ≥ λ > 0. Let r(k, d) = min{L(λ) | 1 ≥ λ > 0}. For example we 
have that r(3, 4) = 8/9. The definition of r(k, d) is, no doubt, mysterious at 
this point, but we will see that it has a very natural motivation. 
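
As a numerical illustration (not part of the original argument), the threshold r(k, d) can be approximated directly from the definition of L(λ). The sketch below is a minimal one: the function names `tail`, `L` and `r` are our own, and the ternary search assumes that L first decreases and then increases on (0, 1], which is how we read the statement that L has a unique minimum.

```python
from math import comb


def tail(m, lam, j):
    """Pr[Bin(m, lam) >= j], computed by summing the binomial probabilities."""
    return sum(comb(m, i) * lam**i * (1 - lam) ** (m - i) for i in range(j, m + 1))


def L(lam, k, d):
    """L(lambda) = lambda / Pr[Bin(d-1, lambda) >= k-1], for 0 < lambda <= 1."""
    return lam / tail(d - 1, lam, k - 1)


def r(k, d, steps=200):
    """Approximate min of L on (0, 1] by ternary search (assumes L is unimodal)."""
    lo, hi = 1e-6, 1.0
    for _ in range(steps):
        a = lo + (hi - lo) / 3
        b = hi - (hi - lo) / 3
        if L(a, k, d) < L(b, k, d):
            hi = b  # the minimizer lies to the left of b
        else:
            lo = a  # the minimizer lies to the right of a
    return L((lo + hi) / 2, k, d)


if __name__ == "__main__":
    print(r(3, 4))  # close to 8/9, attained at lambda = 3/4
```

For d = 4 and k = 3 one has L(λ) = 1/(3λ - 2λ^2), which is minimized at λ = 3/4 with value 8/9, in agreement with the example in the text.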

Theorem 1. (a) If p > r(k, d) then a random Φ ∈ Con(n, d, p) has a k-core 
of linear size with high probability. 

(b) If p < r(k, d) then a random Φ ∈ Con(n, d, p) has only the empty k-core 
with high probability. 

Theorem 1 implies that the analogous result holds for the space of faulty 
random regular graphs (obtained as: first draw a graph, second delete the faulty 
edges). The following algorithm which can easily be executed in the faulty net- 
work itself is at the heart of our argument. 

Algorithm 2 The Global Algorithm 

Input: A faulty configuration Φ. Output: The k-core of Φ. 

while Φ has light classes do 

Φ := the modification of Φ where all light classes are deleted, 
od. Output Φ. 

Specifically, when we delete a class, W, we delete (i) all copies within W, (ii) 
all copies of other classes which are paired with copies of W, (iii) W itself. Note 
that it is possible for W itself to still be undeleted but to contain no copies as 
they were all deleted as a result of neighbouring classes being deleted, or faulty 
edges. In this case, of course, W is light and so it will be deleted on the next 
iteration. At the end of the algorithm Φ has only classes of degree ≥ k, which 
form the k-core. The following notion will be used later on: A class W of the 
faulty configuration Φ survives j (j ≥ 0) rounds of the global algorithm with 
degree t iff W has not yet been deleted and has degree t after the j'th execution 
of the while-loop of the algorithm with input Φ. A class simply survives if it has 
not yet been deleted. 
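
The two-stage experiment and the global algorithm can be sketched in a few lines of Python. This is only an illustration under our own conventions, which are not taken from the paper: copies are the integers 0, ..., n·d - 1, copy c belongs to class c // d, and a faulty configuration is stored as the list of its surviving edges. All function names are hypothetical.

```python
import random
from collections import Counter


def random_faulty_configuration(n, d, p, rng):
    """Stage 1: pair up the n*d copies uniformly at random (n*d must be even).
    Stage 2: keep each edge independently with probability p."""
    copies = list(range(n * d))
    rng.shuffle(copies)
    edges = [(copies[i], copies[i + 1]) for i in range(0, n * d, 2)]
    return [e for e in edges if rng.random() < p]


def degrees(edges, d):
    """Degree of each class = number of its copies that lie in surviving edges."""
    deg = Counter()
    for x, y in edges:
        deg[x // d] += 1
        deg[y // d] += 1  # an edge inside one class contributes two
    return deg


def global_algorithm(edges, n, d, k):
    """Delete all light classes (degree < k) round by round.
    Returns the set of surviving classes and the surviving edges."""
    alive = set(range(n))
    while True:
        deg = degrees(edges, d)
        light = {w for w in alive if deg[w] < k}
        if not light:
            return alive, edges
        alive -= light
        edges = [(x, y) for x, y in edges if x // d in alive and y // d in alive]


if __name__ == "__main__":
    # Classes 0, 1, 2 form a triangle; class 3 hangs off class 0 by one edge.
    toy = [(0, 3), (4, 6), (7, 1), (2, 9)]
    alive, core = global_algorithm(toy, n=4, d=3, k=2)
    print(sorted(alive), len(core))  # [0, 1, 2] 3
```

In the toy example the pendant class 3 is light for k = 2 and is removed in the first round, after which the triangle on classes 0, 1, 2 is the 2-core.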

In section 2 we analyze this algorithm when run for j - 1 executions of 
the loop where we set j = j(n) = √(log_d n) throughout. We prove that the 
number of classes surviving j - 1 rounds with degree t ≥ k is linear in n with 
high probability when p > r(k, d) whereas the number of light classes is o(n). 
(Initially this number is linear in n.) An extra argument presented in section 4 
will show how to get rid of these few light classes leaving us with a linear size 
k-core provided p > r(k, d). On the other hand, if p < r(k, d) then we show 
that the expected number of classes surviving j - 1 rounds with any degree is 
o(n) and that we have no longer enough classes to form a k-core. This is shown 
in section 3. 



2 Reduction of the Number of Light Classes 

For d ≥ t ≥ 0, and for a particular integer j, we let 

X_t : Con(n, d, p) → ℕ (1) 

be the number of classes surviving j - 1 rounds of the global algorithm with 
degree equal to t. As usual we can represent X = X_t as a sum of indicator 
random variables 

X = X_{W_1} + ... + X_{W_n}, (2) 

where X_W assumes the value 1 when the class W survives j - 1 rounds with 
degree equal to t and 0 when this is not the case. Then EX = n · E[X_W] = 
n · Pr[W survives with degree t] for W arbitrary. We determine 
Pr[W survives with degree t] approximately, that is, we give an interval of width o(1) 
which includes the probability. The probability of the event: W survives j - 1 
rounds with degree t, turns out to depend only on the j-environment of 
W defined as: For a class W the j-environment of W, j-Env_Φ(W), is that 
subconfiguration of Φ which has as classes the classes whose distance from W 
is at most j. Here distance means the number of edges in a shortest path. The 
edges of j-Env_Φ(W) are those induced from Φ. 

The proof of the following lemma follows with standard conditioning techniques, 
observing that the j-environment of a class W in a random configuration 
can be generated by a natural probabilistic breadth first generation process (cf. 
[4] for details on this). Here it is important that j only slowly goes to infinity. 

Lemma 1. Let W be a fixed class. Then Pr[j-Env_Φ(W) is a tree] ≥ 1 - o(1). 

Note that the lemma does not mean: Almost always the j-environment of all 
classes is a tree. The definition of j-environment extends to faulty configurations 
in the obvious manner. Focussing on a j-environment which is a tree is very 
convenient since in a faulty configuration, it can be thought of as a branching 
process whereby the number of children of the root is distributed as Bin(d, p), 
and the number of children of each non-root as Bin(d - 1, p). 

The following algorithm approximates the effect the global algorithm has on 
a fixed class W, provided the j-environment of W is a tree. 

Algorithm 3 The Local Algorithm. 

Input: A (sub-)configuration Γ, which is a j-environment of a class W in a 
faulty configuration. Γ is a tree with root W. 

Φ := Γ 

for i = j - 1 downto 0 do 

Modify Φ as follows: Delete all light classes in depth i of the tree Φ. 
od. 

The output is “W survives with degree t” if W is not deleted and has final degree 
t. If W is deleted then the output is “W does not survive”. 

Note that it is not possible for W to survive with degree less than k. By 
round l of the algorithm we mean an execution of the loop with i = j - l where 
1 ≤ l ≤ j. A class in depth i where 0 ≤ i < j survives with degree t iff it is not 
deleted and has degree t after round j - i of the algorithm. Note that classes 
in depth j are never deleted and so they are considered to always survive. The 
next lemma states in which respect the local algorithm approximates the global 
one. The straightforward formal proof is omitted in this abridged version. 

Lemma 2. Let j ≥ 1. For each class W and each faulty configuration Φ where 
j-Env_Φ(W) is a tree we have: After j - 1 rounds of the global algorithm with 
Φ the class W survives with degree t ≥ k if and only if, after running the local 
algorithm with j-Env_Φ(W), the class W survives with degree t ≥ k. 

Note that W either survives j - 1 rounds of the global algorithm and the 
whole local algorithm with the same degree t ≥ k or does not survive the local 
algorithm in which case it does or does not survive j - 1 global rounds, but does 
certainly not survive j global rounds. 

We condition the following considerations on the almost sure event that for 
j = j(n) the j-environment of the class W in the underlying fault free 
configuration is a tree (cf. Lemma 1). We denote this environment in a random faulty 
configuration by E. We turn our attention to the calculation of the survival 
probability with the local algorithm. 

For i with 0 ≤ i ≤ j - 1 let φ_i be the probability that a class in level (=depth) 
j - i of E survives the local algorithm applied to E. As the j-environment in the 
underlying fault-free configuration is a tree, the survival events of the children 
of a given class are independent. Therefore: 

φ_0 = 1 and φ_i = Pr[Bin(d - 1, p · φ_{i-1}) ≥ k - 1]. (3) 

And furthermore, considering now the root W of the j-environment, we get for 
t ≥ k by analogous considerations: 

Pr[W survives the local algorithm with degree t] = Pr[Bin(d, p · φ_{j-1}) = t]. 

We have that the sequence of the φ_i's is monotonically decreasing and in the 
interval [0, 1]. Hence φ = φ(p) = lim_{i→∞} φ_i is well defined and as all functions 
involved are continuous we get: φ = Pr[Bin(d - 1, p · φ) ≥ k - 1]. (Note that 
this is no definition of φ, the equation is always satisfied by φ = 0.) 

Two further notations for subsequent usage: λ_{t,i} = λ_{t,i}(p) = Pr[Bin(d, p · φ_{i-1}) 
= t] for i ≥ 1. Again we have that the λ_{t,i}'s are monotonically decreasing 
and between 0 and 1 and λ_t = λ_t(p) = lim_{i→∞} λ_{t,i} exists. Hence for our fixed 
class W, considering j → ∞, we get: 

Pr[W survives the local algorithm with degree t] = λ_{t,j} = λ_t + o(1). (4) 

Here is where our formula for r(k, d) comes from: 

Lemma 3. φ > 0 iff p ≥ r(k, d). 

Proof. First let φ > 0. As stated above we have φ = Pr[Bin(d - 1, p · φ) ≥ k - 1]. 
Therefore Pr[Bin(d - 1, p · φ) ≥ k - 1] > 0 and setting λ = p · φ, we get λ/p = 
Pr[Bin(d - 1, λ) ≥ k - 1] and so p = λ / Pr[Bin(d - 1, λ) ≥ k - 1] = L(λ) and 
the result follows. 

Now let p ≥ r(k, d). Let λ_0 be such that r(k, d) = L(λ_0). We show by 
induction on i that p · φ_i ≥ λ_0. For the induction base we get: p · φ_0 = p ≥ 
r(k, d) ≥ λ_0 where the last estimate holds because the denominator in the 
definition of L(λ_0) always is ≤ 1. For the induction step we get: 
p · φ_{i+1} = p · Pr[Bin(d - 1, p · φ_i) ≥ k - 1] ≥ p · Pr[Bin(d - 1, λ_0) ≥ k - 1] ≥ λ_0 
where the last but one estimate uses the induction hypothesis and the last one 
follows from the assumption. □ 
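
The recursion (3) and the dichotomy of Lemma 3 are easy to observe numerically. The following sketch iterates φ_0 = 1, φ_i = Pr[Bin(d - 1, p · φ_{i-1}) ≥ k - 1] for a fixed number of steps and then evaluates the limiting degree probabilities; the function names and the iteration count are our own choices, and a finite number of iterations only approximates the limit φ.

```python
from math import comb


def tail(m, q, j):
    """Pr[Bin(m, q) >= j]."""
    return sum(comb(m, i) * q**i * (1 - q) ** (m - i) for i in range(j, m + 1))


def phi_limit(p, k, d, iters=2000):
    """Approximate phi = lim phi_i by iterating recursion (3) from phi_0 = 1."""
    phi = 1.0
    for _ in range(iters):
        phi = tail(d - 1, p * phi, k - 1)
    return phi


def lambda_t(t, p, k, d):
    """Approximate lambda_t = Pr[Bin(d, p * phi) = t]."""
    q = p * phi_limit(p, k, d)
    return comb(d, t) * q**t * (1 - q) ** (d - t)


if __name__ == "__main__":
    # For d = 4, k = 3 the threshold is r(3, 4) = 8/9.
    print(phi_limit(0.80, 3, 4))  # below the threshold: the iterates collapse to 0
    print(phi_limit(0.95, 3, 4))  # above the threshold: a positive limit
```

For d = 4 and k = 3 the fixed point equation reads 2q^2 - 3q + 1/p = 0 with q = p · φ, which has a real solution exactly when p ≥ 8/9; the iteration converges to the larger root.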

We now return to the analysis of the global algorithm. The next corollary 
follows directly with Lemma 1, Lemma 2, and (4). 

Corollary 1. Let W be a fixed class, t ≥ k and let j = j(n) = √(log_d n). In 
the space of faulty configurations we have (cf. (2)): 

Pr[X_W = 1] = Pr[W survives j(n) - 1 global rounds with degree t] 

= λ_t + o(1). 

Next the announced concentration result: 

Theorem 4. Let t ≥ k, X = X_t be the random variable defined as in (1), and 
let λ = λ_t; then we have: 

(1) EX = λ · n + o(n). (2) Almost surely |X - λ · n| ≤ o(n). 

Proof. (1) The claim follows from the representation of X as a sum of indicator 
random variables (cf. (2)) and with Corollary 1. 

(2) We show that VX = o(n^2). This implies the claim with an application 
of Chebyshev's inequality. We have X = X_{W_1} + X_{W_2} + ... + X_{W_n} (cf. 
(2)). This and (1) of the present theorem implies VX = E[X^2] - (EX)^2 = 
E[X^2] - (λ^2 · n^2 + o(n) · n). Moreover, E[X^2] = EX + n · (n - 1) · E[X_U · X_W] = 
λ · n + o(n) + n · (n - 1) · E[X_U · X_W], where U and W are two arbitrary distinct 
classes. We need to show that E[X^2] = λ^2 · n^2 + o(n^2). This follows from 
E[X_U · X_W] = λ^2 + o(1) showing that the events X_U = 1 and X_W = 1 are 
asymptotically independent. This follows by conditioning on the event that the 
j-environments of U and W are disjoint trees and analyzing the breadth first 
generation procedure for the j-environment of a given class. Again we need 
that j goes only slowly to infinity. □ 



3 When There is no k-Core 

The proof of Theorem 1(b) is now quite simple. First we need the following fact: 



Lemma 4. A.s. a random member of Con(n, d, p) has no k-core on o(n) vertices. 



Proof. The lemma follows from the fact that a random member of Con(n, d) a.s. 
has no subconfiguration with average degree at least 3 on at most εn vertices, 
where ε = ε(d) is a small positive constant. Consider any s ≤ εn. The number of 
choices for s classes, 1.5s edges from amongst those classes, and copies for the 
endpoint of each edge, is at most: 








Setting M(t) = t!/(2^{t/2} (t/2)!) to be the number of ways of pairing t copies, 
we have that for any such collection, the probability that those pairs lie in our 
random member of Con(n, d) is 



M{{d-is)n)/M{dn) < 

n 



Therefore, the expected number of such subconfigurations is at most: 




( ( 2 ) ^ /^) ^1.5s^ ^ ^1.5s 

\1.5sJ ^ s 1.5s 



<( 



n 



f{s). 



Therefore, if ε = 1/40d® then the expected number of such subconfigurations is 
less than Σ_{s=2}^{εn} f(s), which is easily verified to be o(1). □ 



Now, by Lemma 3, we have for p < r(k, d) that φ = 0. Therefore, as j goes to 
infinity the expected number of classes surviving j rounds with degree at least 
k is o(n) and so almost surely is o(n). With the last lemma we get Theorem 1 
(b). 



4 When There is a k-Core 

In this section, we prove Theorem 1(a). So we assume that p > r(k, d). We start 
by showing that almost surely very few light classes survive the first j(n) - 1 
iterations: 

Lemma 5. In Con(n, d, p) almost surely: The number of light classes after 
j(n) - 1 = √(log_d n) - 1 rounds of the global algorithm is reduced to o(n). 

Proof. The proof follows with Theorem 4 applied to j - 2 and j - 1 (which both 
go to infinity). □ 

In order to eliminate the light classes still present after j(n) - 1 global rounds, 
we need to know something about the distribution of the configurations after 
j(n) - 1 rounds. As usual in similar situations the uniform distribution needs to 
be preserved. For n̄ = (n_0, n_1, n_2, ..., n_d) where the sum of the n_i is at most 
n we let Con(n̄) be the space of all configurations with n_i classes consisting of 
i copies. Each configuration is equally likely. The following lemma is proved in 
[5]. 

Lemma 6. Conditioning the space Con(n, d, p) on those configurations which 
give a configuration in Con(n̄) after i global rounds, each configuration from 
Con(n̄) has the same probability to occur after i global rounds. 

After running the global algorithm for j(n) - 1 rounds we get by Lemma 5 a 
configuration uniformly distributed in Con(n̄) where n_1 + n_2 + ... + n_{k-1} = o(n) 
and |n_t - λ_t · n| ≤ o(n) for t ≥ k with high probability. A probabilistic analysis 
of the following algorithm eliminating the light classes one by one shows that we 
obtain a linear size k-core with high probability. 

Algorithm 5 

Input: A faulty configuration Φ. 

Output: The k-core of Φ. 
while there exist light classes in Φ do 
Choose uniformly at random a light class W from all light classes 
and delete W and the edges incident with W. 
od. The classes of degree ≥ k are the k-core of Φ. 

In order to perform a probabilistic analysis of this algorithm it is again im- 
portant that the uniform distribution is preserved. A similar result is Proposition 
1 in [6] (for the case of graphs instead of configurations) . 

Lemma 7. If we apply the algorithm above to a uniformly random Φ ∈ Con(n̄), 
(n̄ fixed) for a given number of iterations we get: Conditional on the event (in 
Con(n̄)) that the configuration obtained, T, is in Con(n'_0, n'_1, n'_2, ..., n'_d) the 
configuration T is a uniformly random configuration from this space. 

Lemma 8. We consider probability spaces Con(n̄) where the number of heavy 
vertices is ≥ δ · n. In one round of Algorithm 5 one light class disappears and we 
get ≤ k - 1 new light classes. Let Y : Con(n̄) → ℕ be the number of new light 
classes after one round of Algorithm 5. Let v = Σ_i i · n_i and π = (k · n_k)/v. 
Thus π is the probability to pick a copy of degree k when picking uniformly at 
random from all copies belonging to edges. Then: 

(a) Pr[Y = l] = Pr[Bin(deg(W), π) = l] + o(1). 

(b) EY ≤ (k - 1) · π + o(1). 

The straightforward proof of this lemma is omitted due to lack of space. Our 
next step is to bound π. 

Lemma 9. π ≤ (1 - ε)/(k - 1) for some ε > 0. 

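Proof. We will prove that when p = r(k, d) then π = 1/(k - 1). Since π is easily 
shown to be decreasing in p, this proves our lemma. Recall that r(k, d) is defined 
to be the minimum of the function L(λ). Therefore, at L(λ) = r(k, d), we have 
L'(λ) = 0. Differentiating L, we get: 

\[ \sum_{i=k-1}^{d-1} \binom{d-1}{i} \lambda^{i} (1-\lambda)^{d-1-i} = \sum_{i=k-1}^{d-1} \binom{d-1}{i} \lambda^{i} (1-\lambda)^{d-2-i} \bigl(i - (d-1)\lambda\bigr). \quad (5) \]

A simple inductive proof shows that the RHS of (5) is equal to 

\[ (k-1) \binom{d-1}{k-1} \lambda^{k-1} (1-\lambda)^{d-k}. \quad (6) \]

Indeed, it is trivially true for k = d, and if it is true for k = r + 1 then for k = r 
the RHS is equal to 

\[ \binom{d-1}{r-1} \lambda^{r-1} (1-\lambda)^{d-1-r} \bigl(r - 1 - (d-1)\lambda\bigr) + r \binom{d-1}{r} \lambda^{r} (1-\lambda)^{d-1-r} = (r-1) \binom{d-1}{r-1} \lambda^{r-1} (1-\lambda)^{d-r}. \]

Setting j = i + 1, and multiplying by λd, the LHS of (5) comes to: 

\[ \sum_{j=k}^{d} d \binom{d-1}{j-1} \lambda^{j} (1-\lambda)^{d-j} = \sum_{j=k}^{d} j \binom{d}{j} \lambda^{j} (1-\lambda)^{d-j}, \]

and (6) comes to 

\[ d(k-1) \binom{d-1}{k-1} \lambda^{k} (1-\lambda)^{d-k} = k(k-1) \binom{d}{k} \lambda^{k} (1-\lambda)^{d-k}. \]

Now, since 

\[ \pi = \frac{k \binom{d}{k} \lambda^{k} (1-\lambda)^{d-k}}{\sum_{j=k}^{d} j \binom{d}{j} \lambda^{j} (1-\lambda)^{d-j}} = \frac{1}{k-1}, \]

this establishes our lemma. 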
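
The algebra in this proof can be sanity-checked numerically. The sketch below compares the right-hand side of (5) with the closed form (6) for a few parameter choices and evaluates π at the minimizer of L for d = 4, k = 3, which is λ = 3/4. The function names are ours, and π is computed under the assumption that the number of classes of degree t is proportional to Pr[Bin(d, λ) = t] with λ = p · φ, as suggested by (4).

```python
from math import comb


def rhs5(lam, k, d):
    """Right-hand side of (5); requires 0 < lam < 1."""
    return sum(
        comb(d - 1, i) * lam**i * (1 - lam) ** (d - 2 - i) * (i - (d - 1) * lam)
        for i in range(k - 1, d)
    )


def closed6(lam, k, d):
    """Closed form (6): (k-1) * C(d-1, k-1) * lam^(k-1) * (1-lam)^(d-k)."""
    return (k - 1) * comb(d - 1, k - 1) * lam ** (k - 1) * (1 - lam) ** (d - k)


def pi_ratio(lam, k, d):
    """Share of copies in classes of degree k among copies in classes of degree >= k."""
    def weight(j):
        return j * comb(d, j) * lam**j * (1 - lam) ** (d - j)

    return weight(k) / sum(weight(j) for j in range(k, d + 1))


if __name__ == "__main__":
    print(rhs5(0.5, 3, 7), closed6(0.5, 3, 7))  # the two values agree
    print(pi_ratio(0.75, 3, 4))  # equals 1/(k-1) = 0.5 at the minimizer of L
```

For d = 4 and k = 3 the ratio simplifies to 3(1 - λ)/(3 - 2λ), which is decreasing in λ and equals 1/2 at λ = 3/4.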



□ 

Lemma 10. Algorithm 5 stops after o(n) rounds of the while loop with a linear 
size k-core with high probability (with respect to Con(n̄)). 

Proof. We define Y_i to be the number of light classes remaining after i steps of 
Algorithm 5. By assumption, Y_0 = o(n). Furthermore, by Lemmas 8 and 9, we 
have EY_1 ≤ Y_0 - 1 + (k - 1)π ≤ Y_0 - ε. Furthermore, it is not hard to verify 
that, since there are Θ(n) classes of degree k, then so long as i = o(n) we have 

EY_{i+1} ≤ Y_i - ε/2, 

and in particular, the probability that at least l new light vertices are formed 
during step i is less than the probability that the binomial variable Bin(k - 1, π) 
is at least l. 

Therefore, for any t = o(n), Y_0, Y_1, ..., Y_t is statistically dominated by a 
random walk defined as: 

Z_0 = Y_0; Z_{i+1} = Z_i - 1 + Bin(k - 1, (1 - ε/2)/(k - 1)). 

Since Z_i has a drift of -ε/2, it is easy to verify that with high probability, Z_t = 0 
for some t = o(n), and thus with high probability Y_t = 0 as well. 

If Y_t = 0 then we are left with a k-core of linear size. □ 
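
To get a feeling for the dominating random walk, one can simulate Z_{i+1} = Z_i - 1 + Bin(k - 1, (1 - ε/2)/(k - 1)) and record when it first hits 0. This is a rough sketch with hypothetical names and a step cap of our own choosing; since the walk decreases by at most one per step, the hitting time is at least Z_0, and with drift -ε/2 its expectation is about 2 · Z_0/ε.

```python
import random


def hitting_time(z0, k, eps, rng, max_steps=10**6):
    """Simulate the walk Z until it reaches 0; return the number of steps,
    or None if 0 is not reached within max_steps."""
    q = (1 - eps / 2) / (k - 1)  # success probability of the binomial increment
    z, t = z0, 0
    while z > 0 and t < max_steps:
        new_light = sum(rng.random() < q for _ in range(k - 1))  # Bin(k-1, q) draw
        z += new_light - 1
        t += 1
    return t if z == 0 else None


if __name__ == "__main__":
    rng = random.Random(1)
    print(hitting_time(50, 3, 0.5, rng))  # typically of order 2 * 50 / 0.5 = 200
```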

Clearly Lemma 10 implies Theorem 1(a). 



Abstract. Haviland and Thomason and Chung and Graham were the first to investigate 
systematically some properties of quasi-random hypergraphs. In particular, 
in a series of articles, Chung and Graham considered several quite disparate prop- 
erties of random-like hypergraphs of density 1/2 and proved that they are in fact 
equivalent. The central concept in their work turned out to be the so called devi- 
ation of a hypergraph. Chung and Graham proved that having small deviation is 
equivalent to a variety of other properties that describe quasi-randomness. In this 
note, we consider the concept of discrepancy for k-uniform hypergraphs with an 
arbitrary constant density d (0 < d < 1) and prove that the condition of having 
asymptotically vanishing discrepancy is equivalent to several other quasi-random 
properties of H, similar to the ones introduced by Chung and Graham. In particular, 
we give a proof of the fact that having the correct ‘spectrum’ of the s-vertex 
subhypergraphs is equivalent to quasi-randomness for any s ≥ 2k. Our work can 
be viewed as an extension of the results of Chung and Graham to the case of an 
arbitrary constant valued density. Our methods, however, are based on different 
ideas. 



1 Introduction and the Main Result 

The usefulness of random structures in theoretical computer science and in discrete 
mathematics is well known. An important, closely related question is the following: 
which, if any, of the almost sure properties of such structures suffice for a deterministic 
object to have to be as useful or relevant? 

Our main concern here is to address the above question in the context of hypergraphs. 
We shall continue the study of quasi-random hypergraphs along the lines initiated by 
Haviland and Thomason [7,8] and especially by Chung [2], and Chung and Graham [3,4]. 
One of the central concepts concerning hypergraph quasi-randomness, the so called 
hypergraph discrepancy, was investigated by Babai, Nisan, and Szegedy [1], who found 
a connection between communication complexity and hypergraph discrepancy. This 
connection was further studied by Chung and Tetali [5]. Here, we carry out the 
investigation very much along the lines of Chung and Graham [3,4], except that we focus on 
hypergraphs of arbitrary constant density, making use of different techniques. 

MCT/FINEP (PRONEX project 107/97). 

G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 48-57, 2000. 

(c) Springer-Verlag Berlin Heidelberg 2000 

In the remainder of this introduction, we carefully discuss a result of Chung and 
Graham [3] and state our main result, Theorem 3 below. 



1.1 The Result of Chung and Graham 



We need to start with some definitions. For a set V and an integer k ≥ 2, let [V]^k 
denote the system of all k-element subsets of V. A subset G ⊂ [V]^k is called a 
k-uniform hypergraph. If k = 2, we have a graph. We sometimes use the notation G = 
(V(G), E(G)). If there is no danger of confusion, we shall identify the hypergraphs with 
their edge sets. Throughout this paper, the integer k is assumed to be a fixed constant. 

For any l-uniform hypergraph G and k ≥ l, let 𝒦_k(G) be the set of all k-element sets 
that span a clique on k vertices. We also denote by K_k(2) the complete k-partite 
k-uniform hypergraph whose every partite set contains precisely two vertices. We refer 
to K_k(2) as the generalized octahedron, or, simply, the octahedron. 

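We also consider a function μ_H : [V]^k → {-1, 1} such that, for all e ∈ [V]^k, we 
have 

\[ \mu_H(e) = \begin{cases} -1, & \text{if } e \in H, \\ \phantom{-}1, & \text{if } e \notin H. \end{cases} \]

Let [k] = {1, 2, ..., k}, let V^{2k} denote the set of all 2k-tuples (v_1, v_2, ..., v_{2k}), where 
v_i ∈ V (1 ≤ i ≤ 2k), and let Π_H^{(k)} : V^{2k} → {-1, 1} be given by 

\[ \Pi_H^{(k)}(u_1, \dots, u_k, v_1, \dots, v_k) = \prod_{\varepsilon} \mu_H(\varepsilon_1, \dots, \varepsilon_k), \]

where the product is over all vectors ε = (ε_i)_{i=1}^{k} with ε_i ∈ {u_i, v_i} for all i and we 
understand μ_H to be 1 on arguments with repeated entries. 

Following Chung and Graham (see, e.g., [4]), we define the deviation dev(H) of H 
by 

\[ \mathrm{dev}(H) = \frac{1}{m^{2k}} \sum_{u_i, v_i \in V,\ i \in [k]} \Pi_H^{(k)}(u_1, \dots, u_k, v_1, \dots, v_k), \]

where m = |V|. 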

For two hypergraphs G and H, we denote by \binom{H}{G} the set of all induced subhypergraphs 
of H that are isomorphic to G. We also write \binom{H}{G}^w for the number of weak (i.e., not 
necessarily induced) subhypergraphs of H that are isomorphic to G. Furthermore, we 
need the notion of the link of a vertex. 



Definition 1 Let H be a k-uniform hypergraph and x ∈ V(H). We shall call the 
(k-1)-uniform hypergraph 

H(x) = {e \ {x} : e ∈ H, x ∈ e} 

the link of the vertex x in H. For a subset W ⊂ V(H), the joint W-link is H(W) = 
∩_{x ∈ W} H(x). For simplicity, if W = {x_1, ..., x_k}, we write H(x_1, ..., x_k). 

Observe that if H is k-partite, then H(x) is (k - 1)-partite for every x ∈ V. Furthermore, 
if k = 2, then H(x) may be identified with the ordinary graph neighbourhood of 
x. Moreover, H(x, x') may be thought of as the ‘joint neighbourhood’ of x and x'. 
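
Links and joint links are straightforward to compute when a hypergraph is stored as a set of frozensets. The following sketch uses that representation, which is our own choice, and the names `link` and `joint_link` are hypothetical.

```python
def link(H, x):
    """Link of x: all sets e minus {x} for edges e of H that contain x."""
    return {e - {x} for e in H if x in e}


def joint_link(H, W):
    """Joint W-link: intersection of the links of all vertices in W (W non-empty)."""
    return set.intersection(*(link(H, x) for x in W))


if __name__ == "__main__":
    # A graph (k = 2): the link of a vertex is its neighbourhood, as singletons.
    G = {frozenset(e) for e in [(1, 2), (1, 3), (2, 3), (3, 4)]}
    print(joint_link(G, [1, 2]) == {frozenset({3})})  # True: 3 is the common neighbour
```

For a 3-uniform hypergraph the link of a vertex is an ordinary graph, and the joint link of x and x' consists of the pairs that form an edge with both x and x'.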

In [3], Chung and Graham proved that if the density of an m-vertex k-uniform 
hypergraph H is 1/2, i.e., |H| = (1/2 + o(1)) \binom{m}{k}, where o(1) → 0 as m → ∞, then 
the following statements are equivalent: 

(Q1(s)) for all k-uniform hypergraphs G on s ≥ 2k vertices and automorphism group 
Aut(G), 

\[ \left| \binom{H}{G} \right| = (1 + o(1)) \binom{m}{s} 2^{-\binom{s}{k}} \frac{s!}{|\mathrm{Aut}(G)|}, \]

(Q2) for all k-uniform hypergraphs G on 2k vertices and automorphism group Aut(G), 
we have 

\[ \left| \binom{H}{G} \right| = (1 + o(1)) \binom{m}{2k} 2^{-\binom{2k}{k}} \frac{(2k)!}{|\mathrm{Aut}(G)|}, \]

(Q3) dev(H) = o(1), 

(Q4) for almost all choices of vertices x, y ∈ V, the (k - 1)-uniform hypergraph 
that is the complement [V]^{k-1} \ (H(x) Δ H(y)) of the symmetric 
difference of H(x) and H(y) satisfies Q2 with k replaced by k - 1, 

(Q5) for 1 ≤ r ≤ 2k - 1 and almost all x, y ∈ V, 



fn{x,y)\ 






The equivalence of these properties is to be understood in the following sense. If we 
have two properties P = P(o(1)) and P' = P'(o(1)), then “P ⇒ P'” means that 
for every ε > 0 there is a δ > 0 so that any k-uniform hypergraph H on m vertices 
satisfying P(δ) must also satisfy P'(ε), provided m > M_0(ε). 

In [3] Chung and Graham stated that “it would be profitable to explore 
quasi-randomness extended to simulating random k-uniform hypergraphs G_p(n) for p ≠ 1/2, 
or, more generally, for p = p(n), especially along the lines carried out so fruitfully by 
Thomason [13,14].” Our present aim is to explore quasi-randomness from this point of 
view. In this paper, we concentrate on the case in which p is an arbitrary constant. In 
certain crucial parts, our methods are different from the ones of Chung and Graham. 
Indeed, it seems to us that the fact that the density of H is 1/2 is essential in certain 
proofs in [3] (especially those involving the concept of deviation). 



1.2 Discrepancy and Subgraph Counting 

The following concept was proposed by Frankl and Rödl and later investigated by 
Chung [2] and Chung and Graham in [3,4]. For an m-vertex k-uniform hypergraph 
H with vertex set V, we define the density d(H) and the discrepancy disc_{1/2}(H) of H 
by letting d(H) = |H| \binom{m}{k}^{-1} and 

\[ \mathrm{disc}_{1/2}(H) = \frac{1}{m^{k}} \max_{G \subset [V]^{k-1}} \Bigl| \, |H \cap \mathcal{K}_k(G)| - |\bar{H} \cap \mathcal{K}_k(G)| \, \Bigr|, \quad (1) \]

where the maximum is taken over all (k - 1)-uniform hypergraphs G with vertex set V, 
and H̄ is the complement [V]^k \ H of H. 

To accommodate arbitrary densities, we extend the latter concept as follows. 

Definition 2 Let H be a k-uniform hypergraph with vertex set V with |V| = m. We 
define the discrepancy disc(H) of H as follows: 

\[ \mathrm{disc}(H) = \frac{1}{m^{k}} \max_{G \subset [V]^{k-1}} \Bigl| \, |H \cap \mathcal{K}_k(G)| - d(H) |\mathcal{K}_k(G)| \, \Bigr|, \quad (2) \]

where the maximum is taken over all (k - 1)-uniform hypergraphs G with vertex set V. 
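
For graphs (k = 2) the definition can be evaluated by brute force on very small examples: a 1-uniform hypergraph G on V is just a subset U of V, and 𝒦_2(G) is the set of pairs inside U. The sketch below does exactly that. It is only an illustration: the name `disc_graph` is ours, the normalization by m^k = m^2 is an assumption about the lost prefactor in the definition, and the running time is exponential in the number of vertices.

```python
from itertools import combinations
from math import comb


def disc_graph(V, E):
    """Brute-force discrepancy of a graph on vertex list V with edge list E (k = 2).
    Assumes the normalization 1/m^2 and the density d(H) = |H| / C(m, 2)."""
    m = len(V)
    edges = {frozenset(e) for e in E}
    dens = len(edges) / comb(m, 2)
    best = 0.0
    for size in range(m + 1):
        for U in combinations(V, size):
            inside = sum(1 for e in combinations(U, 2) if frozenset(e) in edges)
            best = max(best, abs(inside - dens * comb(size, 2)))
    return best / m**2


if __name__ == "__main__":
    two_triangles = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]
    print(disc_graph(list(range(6)), two_triangles))  # about 0.05, attained by one triangle
```

A complete graph has discrepancy 0 under this definition, since every vertex subset carries exactly its expected share of edges.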

Observe that if d(H) = 1/2, then disc(H) = (1/2) disc_{1/2}(H), so both notions 
are equivalent. Following some initial considerations by Frankl and Rödl, Chung and 
Graham investigated the relation between discrepancy and deviation. In fact, Chung [2] 
succeeded in proving the following inequalities closely connecting these quantities: 

(i) dev(H) < 4^k (disc_{1/2}(H))^{1/2^k}, 

(ii) disc_{1/2}(H) < (dev(H))^{1/2^k}. 

For simplicity, we state the inequalities for the density 1/2 case. For the general case, 
see Section 5 of [2]. 

Before we proceed, we need to introduce a new concept. If the vertex set of a 
hypergraph is totally ordered, we say that we have an ordered hypergraph. Given two 
ordered hypergraphs G_< and H_{<'}, where < and <' denote the orderings on the vertex 
sets of G = G_< and H = H_{<'}, we say that a function f: V(G) → V(H) is an embedding 
of ordered hypergraphs if (i) it is an injection, (ii) it respects the orderings, i.e., f(x) <' 
f(y) whenever x < y, and (iii) f(g) ∈ H if and only if g ∈ G, where f(g) is the set 
formed by the images of all the vertices in g. Furthermore, if G = G_< and H = H_{<'}, 
we write \binom{H}{G}_{ord} for the number of such embeddings. 

As our main result, we shall prove the following extension of Chung and Graham’s 
result. 



Theorem 3 Let H = (V, E) be a k-uniform hypergraph of density 0 < d < 1. Then the 
following statements are equivalent: 

(P1) disc(H) = o(1), 

(P2) disc(H(x)) = o(1) for all but o(m) vertices x ∈ V and disc(H(x, y)) = o(1) 
for all but o(m^2) pairs x, y ∈ V, 

(P3) disc(H(x, y)) = o(1) for all but o(m^2) pairs x, y ∈ V, 

(P4) the number of octahedra K_k(2) in H is asymptotically minimized among all 
k-uniform hypergraphs of density d; indeed, 

\[ \binom{H}{K_k(2)}^{w} = (1 + o(1)) \frac{m^{2k}}{2^{k} k!} d^{2^{k}}, \]

(P5) for any s ≥ 2k and any k-uniform hypergraph G on s vertices with e(G) edges 
and automorphism group Aut(G), 

\[ \left| \binom{H}{G} \right| = (1 + o(1)) \binom{m}{s} d^{e(G)} (1-d)^{\binom{s}{k} - e(G)} \frac{s!}{|\mathrm{Aut}(G)|}, \]

(P5') for any ordering H_< of H and for any fixed integer s ≥ 2k, any ordered k-uniform 
hypergraph G_< on s vertices with e(G) edges is such that 

\[ \binom{H_<}{G_<}_{\mathrm{ord}} = (1 + o(1)) \binom{m}{s} d^{e(G)} (1-d)^{\binom{s}{k} - e(G)}, \]

(P6) for all k-uniform hypergraphs G on 2k vertices with e(G) edges and automorphism 
group Aut(G), 

\[ \left| \binom{H}{G} \right| = (1 + o(1)) \binom{m}{2k} d^{e(G)} (1-d)^{\binom{2k}{k} - e(G)} \frac{(2k)!}{|\mathrm{Aut}(G)|}, \]

(P6') for any ordering H_< of H, any ordered k-uniform hypergraph G_< on 2k vertices 
with e(G) edges is such that 

\[ \binom{H_<}{G_<}_{\mathrm{ord}} = (1 + o(1)) \binom{m}{2k} d^{e(G)} (1-d)^{\binom{2k}{k} - e(G)}. \]

Some of the implications in Theorem 3 are fairly easy or are by now quite standard. 
There are, however, two implications that appear to be more difficult. 

The proof of Chung and Graham that dev_{1/2}(H) = o(1) implies P5 (the ‘subgraph 
counting formula’) is based on an approach that has its roots in a seminal paper of 
Wilson [15]. This beautiful proof seems to make non-trivial use of the fact that d(H) = 
1/2. Our proof of the implication that small discrepancy implies the subgraph counting 
formula (P1 ⇒ P5) is based on a different technique, which works well in the arbitrary 
constant density case (see Section 2.2). 

Our second contribution, which is somewhat more technical in nature, lies in a novel 
approach for the proof of the implication P2 ⇒ P1. Our proof is based on a variant of 
the Regularity Lemma of Szemerédi [12] for hypergraphs [6] (see Section 2.1). 



2 Main Steps in the Proof of Theorem 3 

2.1 The First Part 

The first part of the proof of Theorem 3 consists of proving that properties P1, ..., P4 are 
mutually equivalent. As it turns out, the proof becomes more transparent if we restrict 
ourselves to k-partite hypergraphs. In the next paragraph, we introduce some definitions 
that will allow us to state the k-partite version of P1, ..., P4 (see Theorem 15). We close 
this section introducing the main tool in the proof of Theorem 15, namely, we state a 
version of the Regularity Lemma for hypergraphs (see Lemma 20). 

Definitions for Partite Hypergraphs. For simplicity, we first introduce the term 
cylinder to mean partite hypergraphs. 

Definition 4 Let k ≥ l ≥ 2 be two integers. We shall refer to any k-partite l-uniform 
hypergraph H = (V_1 ∪ ... ∪ V_k, E) as a k-partite l-cylinder or (k, l)-cylinder. If 
l = k - 1, we shall often write H_i for the subhypergraph of H induced on ∪_{j≠i} V_j. 
Clearly, H = ∪_{i=1}^{k} H_i. We shall also denote by K_k^{(l)}(V_1, ..., V_k) the complete 
(k, l)-cylinder with vertex partition V_1 ∪ ... ∪ V_k. 

Definition 5 For a (k, l)-cylinder H, we shall denote by 𝒦_j(H), l ≤ j ≤ k, the 
(k, j)-cylinder whose edges are precisely those j-element subsets of V(H) that span cliques 
of order j in H. 



When we deal with cylinders, we have to measure density according to their natural 
vertex partitions. 



Definition 6 Let H be a (k, k)-cylinder with k-partition V = V_1 ∪ ... ∪ V_k. We define 
the k-partite density or simply the density d(H) of H by 

\[ d(H) = \frac{|H|}{|V_1| \cdots |V_k|}. \]

To be precise, we should have a distinguished piece of notation for the notion of 
k-partite density. However, the context will always make clear which notion we mean 
when we talk about the density of a (k, k)-cylinder. 

We should also be careful when we talk about the discrepancy of a cylinder. 



Definition 7 Let H be a (k, k)-cylinder with vertex set V = V_1 ∪ ... ∪ V_k. We define 
the discrepancy disc(H) of H as follows: 

\[ \mathrm{disc}(H) = \frac{1}{|V_1| \cdots |V_k|} \max_{G} \Bigl| \, |H \cap \mathcal{K}_k(G)| - d(H) |\mathcal{K}_k(G)| \, \Bigr|, \quad (3) \]

where the maximum is taken over all (k, k - 1)-cylinders G with vertex set V = V_1 ∪ 
... ∪ V_k. 

We now introduce a simple but important concept concerning the “regularity” of a 
(k, k)-cylinder. 

Definition 8 Let H be a (k, k)-cylinder with k-partition V = V_1 ∪ ... ∪ V_k and let 
δ < α be two positive real numbers. We say that H is (α, δ)-regular if the following 
condition is satisfied: if G is any (k, k - 1)-cylinder such that |𝒦_k(G)| ≥ δ|V_1| ··· |V_k| 
then 

(α - δ)|𝒦_k(G)| ≤ |H ∩ 𝒦_k(G)| ≤ (α + δ)|𝒦_k(G)|. (4) 



Lemma 9 Let H be an (α, δ)-regular (k, k)-cylinder. Then disc(H) ≤ 2δ. 



Lemma 10 Suppose H is a (k, k)-cylinder with k-partition V = V_1 ∪ ... ∪ V_k. Put 
α = d(H) and assume that disc(H) ≤ δ. Then H is (α, δ^{1/2})-regular. 

The $k$-Partite Result. Suppose $\mathcal{H}$ is a $k$-uniform hypergraph and let $\mathcal{H}'$ be a 'typical'
$k$-partite spanning subhypergraph of $\mathcal{H}$. In this section, we relate the discrepancies of $\mathcal{H}$
and $\mathcal{H}'$.

Definition 11 Let $\mathcal{H} = (V, E)$ be a $k$-uniform hypergraph with $m$ vertices and let
$\mathcal{P} = (V_i)_1^k$ be a partition of $V$. We denote by $\mathcal{H}_{\mathcal{P}}$ the $(k,k)$-cylinder consisting of the
edges $h \in \mathcal{H}$ satisfying $|h \cap V_i| = 1$ for all $1 \le i \le k$.

The following lemma holds.

Lemma 12 For any partition $\mathcal{P} = (V_i)_1^k$ of $V$, we have

(i) $\mathrm{disc}(\mathcal{H}) \ge |d(\mathcal{H}_{\mathcal{P}}) - d(\mathcal{H})|\,|V_1| \cdots |V_k| / m^k$;

(ii) $\mathrm{disc}(\mathcal{H}_{\mathcal{P}}) \le 2\,\mathrm{disc}(\mathcal{H})\, m^k / (|V_1| \cdots |V_k|)$.

An immediate consequence of the previous lemma is the following.

Lemma 13 If $\mathrm{disc}(\mathcal{H}) = o(1)$, then $\mathrm{disc}(\mathcal{H}_{\mathcal{P}}) = o(1)$ for $(1 - o(1))k^m$ partitions
$\mathcal{P} = (V_i)_1^k$ of $V$.

With some more effort, one may prove a converse to Lemma 13.

Lemma 14 Suppose there exists a real number $\gamma > 0$ such that $\mathrm{disc}(\mathcal{H}_{\mathcal{P}}) = o(1)$ for
$\gamma k^m$ partitions $\mathcal{P} = (V_i)_1^k$ of $V$. Then $\mathrm{disc}(\mathcal{H}) = o(1)$.

We now state the $k$-partite version of a part of our main result, Theorem 3.

Theorem 15 Suppose $V = V_1 \cup \dots \cup V_k$, $|V_1| = \dots = |V_k| = n$, and let $\mathcal{H} = (V, E)$
be a $(k,k)$-cylinder with $|\mathcal{H}| = dn^k$. Then the following four conditions are equivalent:

(C1) $\mathcal{H}$ is $(d, o(1))$-regular;

(C2) $\mathcal{H}(x)$ is $(d, o(1))$-regular for all but $o(n)$ vertices $x \in V_k$ and $\mathcal{H}(x,y)$ is
$(d^2, o(1))$-regular for all but $o(n^2)$ pairs $x, y \in V_k$;

(C3) $\mathcal{H}(x,y)$ is $(d^2, o(1))$-regular for all but $o(n^2)$ pairs $x, y \in V_k$;

(C4) the number of copies of $K_k(2)$ in $\mathcal{H}$ is asymptotically minimized among all such
$(k,k)$-cylinders of density $d$, and equals $(1 + o(1))\,n^{2k} d^{2^k} / 2^k$.

Remark 1. The condition $|V_1| = \dots = |V_k| = n$ in the result above has the sole purpose
of making the statement more transparent. The immediate generalization of Theorem 15
for $V_1, \dots, V_k$ of arbitrary sizes holds.

Remark 2. The fact that the minimal number of octahedra in a $(k,k)$-cylinder is asymptotically
$(1 + o(1))\,n^{2k} d^{2^k} / 2^k$ is not difficult to deduce from a standard application of
the Cauchy-Schwarz inequality for counting "cherries" (paths of length 2) in bipartite
graphs.

We leave the derivation of the equivalence of properties $P_1, \dots, P_4$ from Theorem 15
to the full paper.

A Regularity Lemma. The hardest part in the proof of Theorem 15 is the implication
$C_2 \Rightarrow C_1$. In this paragraph, we discuss the main tool used in the proof of this implication.
It turns out that, in what follows, the notation is simplified if we consider $(k+1)$-partite
hypergraphs.

Throughout this paragraph, we let $\mathcal{G}$ be a fixed $(k+1,k)$-cylinder with vertex set
$V(\mathcal{G}) = V_1 \cup \dots \cup V_{k+1}$. Recall that $\mathcal{G} = \bigcup_{i=1}^{k+1} \mathcal{G}_i$, where $\mathcal{G}_i$ is the corresponding $(k,k)$-
cylinder induced on $\bigcup_{j \ne i} V_j$. In this section, we shall focus on "regularizing" the $(k,k)$-
cylinders $\mathcal{G}_1, \dots, \mathcal{G}_k$, ignoring $\mathcal{G}_{k+1}$. Alternatively, we may assume that $\mathcal{G}_{k+1} = \emptyset$.

Definition 16 Let $\mathcal{F} = \bigcup_{i=1}^{k} \mathcal{F}_i$ be a $(k,k-1)$-cylinder with vertex set $V_1 \cup \dots \cup V_k$.
For a vertex $x \in V_{k+1}$, we define the $\mathcal{G}$-link $\mathcal{G}_{\mathcal{F}}(x)$ of $x$ with respect to $\mathcal{F}$ to be the
$(k,k-1)$-cylinder $\mathcal{G}_{\mathcal{F}}(x) = \mathcal{G}(x) \cap \mathcal{F}$.



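Definition 17 Let $W \subseteq V_{k+1}$ and let $\mathcal{F} = \bigcup_{i=1}^{k} \mathcal{F}_i$ be as above. We shall say that the
pair $(\mathcal{F}, W)$ is $(\varepsilon, d)$-regular if

$$\left| \frac{|\mathcal{K}_k(\mathcal{G}_{\mathcal{F}}(x))|}{|\mathcal{K}_k(\mathcal{F})|} - d \right| < \varepsilon \qquad (5)$$

for all but at most $\varepsilon|W|$ vertices $x \in W$, and

$$\left| \frac{|\mathcal{K}_k(\mathcal{G}_{\mathcal{F}}(x)) \cap \mathcal{K}_k(\mathcal{G}_{\mathcal{F}}(y))|}{|\mathcal{K}_k(\mathcal{F})|} - d^2 \right| < \varepsilon \qquad (6)$$

for all but at most $\varepsilon|W|^2$ pairs $x, y \in W$.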



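Definition 18 Let $t$ be a positive integer and let $V_{k+1} = W_1 \cup \dots \cup W_t$ be an arbitrary
partition of $V_{k+1}$. For every $i \in [k]$, consider a $t$-partition $\mathcal{P}_i^{(t)} = \{\mathcal{E}_1^{(i)}, \dots, \mathcal{E}_t^{(i)}\}$ of
$V_1 \times \dots \times V_{i-1} \times V_{i+1} \times \dots \times V_k$, and put $\mathcal{P}^{(t)} = (\mathcal{P}_1^{(t)}, \dots, \mathcal{P}_k^{(t)})$. We shall
write $\mathcal{E}(\mathcal{P}^{(t)})$ for the collection of all $(k,k-1)$-cylinders $\mathcal{E}$ of the form $\mathcal{E}^{(1)} \cup \dots \cup \mathcal{E}^{(k)}$,
where $\mathcal{E}^{(i)} \in \mathcal{P}_i^{(t)}$ for all $1 \le i \le k$.

Clearly, with the notation as above, we have $|\mathcal{E}(\mathcal{P}^{(t)})| = t^k$. Moreover, observe that
each of the pairs $(\mathcal{E}, W_i)$, where $\mathcal{E} \in \mathcal{E}(\mathcal{P}^{(t)})$ and $1 \le i \le t$, may be classified as
$\varepsilon$-regular or $\varepsilon$-irregular (i.e., not $\varepsilon$-regular), according to Definition 17. Also, notice that
each $v = (v_1, \dots, v_{k+1}) \in V_1 \times \dots \times V_{k+1}$ is 'covered' by exactly one such pair, that
is, $v \in \mathcal{K}_k(\mathcal{E}) \times W_i$ for a unique pair $(\mathcal{E}, W_i)$.

Definition 19 Let $\mathcal{P}^{(t)} = (\mathcal{P}_i^{(t)})_1^k$ and $W_1, \dots, W_t$ be as in Definition 18. We shall say that
the system of partitions $\{\mathcal{P}_1^{(t)}, \dots, \mathcal{P}_k^{(t)}, \{W_1, \dots, W_t\}\}$ is $\varepsilon$-regular if the number of
$(k+1)$-tuples $(v_1, \dots, v_{k+1}) \in V_1 \times \dots \times V_{k+1}$ that are not covered by the family of
$\varepsilon$-regular pairs $(\mathcal{E}, W_i)$ with $\mathcal{E} \in \mathcal{E}(\mathcal{P}^{(t)})$ and $1 \le i \le t$ is at most $\varepsilon |V_1| \cdots |V_{k+1}|$.

The main tool in the proof of $C_2 \Rightarrow C_1$ is the following result (see [9] for the details).

Lemma 20 For every $\varepsilon > 0$ and $t_0 \ge 1$, there exist integers $n_0$ and $T_0$ such that every
$(k+1,k)$-cylinder $\mathcal{G} = \bigcup_{i=1}^{k+1} \mathcal{G}_i$ with vertex set $V_1 \cup \dots \cup V_{k+1}$, where $|V_i| \ge n_0$ for all $i$,
$1 \le i \le k+1$, admits an $\varepsilon$-regular system of partitions $\{\mathcal{P}_1^{(t)}, \dots, \mathcal{P}_k^{(t)}, \{W_1, \dots, W_t\}\}$
with $t_0 \le t \le T_0$.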

2.2 The Subgraph Counting Formula 

In this section, we shall state the main result that may be used to prove the implication 
Pi Pg. To this end, we need to introduce some notation. Throughout this section, 
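$s \ge 2k$ is some fixed integer.

If $\mathcal{H}$ and $\mathcal{G}$ are, respectively, $k$-uniform and $l$-uniform ($k > l$), then we say that $\mathcal{H}$
is supported on $\mathcal{G}$ if $\mathcal{H} \subseteq \mathcal{K}_k(\mathcal{G})$.

Suppose we have pairwise disjoint sets $W_1, \dots, W_s$, with $|W_i| = n$ for all $i$. Suppose
further that we have a sequence $\mathcal{G}^{(2)}, \dots, \mathcal{G}^{(k)}$ of $s$-partite cylinders on $W_1 \cup \dots \cup W_s$,
with $\mathcal{G}^{(i)}$ an $(s,i)$-cylinder and, moreover, such that $\mathcal{G}^{(i)}$ is supported on $\mathcal{G}^{(i-1)}$ for all
$3 \le i \le k$. Suppose also that, for all $2 \le i \le k$ and for all $1 \le j_1 < \dots < j_i \le s$,
the $(i,i)$-cylinder $\mathcal{G}^{(i)}[j_1, \dots, j_i] = \mathcal{G}^{(i)}[W_{j_1} \cup \dots \cup W_{j_i}]$ is $(\gamma_i, \delta)$-regular with respect
to $\mathcal{G}^{(i-1)}[j_1, \dots, j_i] = \mathcal{G}^{(i-1)}[W_{j_1} \cup \dots \cup W_{j_i}]$, that is, whenever $\mathcal{G} \subseteq \mathcal{G}^{(i-1)}[j_1, \dots, j_i]$
is such that $|\mathcal{K}_i(\mathcal{G})| \ge \delta\,|\mathcal{K}_i(\mathcal{G}^{(i-1)}[j_1, \dots, j_i])|$, we have

$$(\gamma_i - \delta)\,|\mathcal{K}_i(\mathcal{G})| \le |\mathcal{G}^{(i)}[j_1, \dots, j_i] \cap \mathcal{K}_i(\mathcal{G})| \le (\gamma_i + \delta)\,|\mathcal{K}_i(\mathcal{G})|.$$

Finally, let us say that a copy of $K_s^{(k)}$ in $W_1 \cup \dots \cup W_s$ is transversal if
$|V(K_s^{(k)}) \cap W_i| = 1$ for all $1 \le i \le s$.

Our main result concerning counting subhypergraphs is then the following.

Theorem 21 For any $\varepsilon > 0$ and any $\gamma_2, \dots, \gamma_k > 0$, there is $\delta_0 > 0$ such that if $\delta < \delta_0$,
then the number of transversal $K_s^{(k)}$ in $\mathcal{G}^{(k)}$ is $(1 + O(\varepsilon))\,\gamma_k^{\binom{s}{k}} \cdots \gamma_2^{\binom{s}{2}}\, n^s$.

Theorem 21 above is an instance of certain counting lemmas developed by Rodl and
Skokan for such complexes $\mathcal{G} = \{\mathcal{G}^{(i)}\}_{2 \le i \le k}$.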

3 Concluding Remarks 

We hope that the discussion above on our proof approach for Theorem 3 gives some 
idea about our methods and techniques. Unfortunately, because of space limitations and 
because we discuss the motivation behind our work in detail, we are unable to give more 
details. We refer the interested reader to [9]. 

It is also our hope that the reader will have seen that many interesting questions 
remain. Probably, the most challenging of them concerns developing an applicable theory 
of sparse quasi-random hypergraphs. Here, we have in mind such lemmas for sparse 
quasi-random graphs as the ones in [10].

Abstract. The Cube Packing Problem (CPP) is defined as follows. Find
a packing of a given list of (small) cubes into a minimum number of 
(larger) identical cubes. We show first that the approach introduced by 
Coppersmith and Raghavan for general online algorithms for packing 
problems leads to an online algorithm for CPP with asymptotic perfor- 
mance bound 3.954. Then we describe two other offline approximation 
algorithms for CPP: one with asymptotic performance bound 3.466 and 
the other with 2.669. A parametric version of this problem is defined and 
results on online and offline algorithms are presented. We did not find in 
the literature offline algorithms with asymptotic performance bounds as 
good as 2.669. 



1 Introduction 

The Cube Packing Problem (CPP) is defined as follows. Given a list L of n cubes 
(of different dimensions) and identical cubes, called bins, find a packing of the 
cubes of L into a minimum number of bins. The packings we consider are all 
orthogonal. That is, with respect to a fixed side of the bin, the sides of the cubes 
must be parallel or orthogonal to it. 

CPP is a special case of the Three-dimensional Bin Packing Problem (3BP). 
In this problem the list L consists of rectangular boxes and the bins are also 
rectangular boxes. Here, we may assume that the bins are cubes, since otherwise 
we can scale the bins and the boxes in L correspondingly. 

In 1989, Coppersmith and Raghavan [6] presented an online algorithm for 
3BP, with asymptotic performance bound 6.25. Then, in 1992, Li and Cheng 
[11] presented an algorithm with asymptotic performance bound close to 4.93. 
Improving the latter result, Csirik and van Vliet [7], and also Li and Cheng 
[10] designed algorithms for 3BP with asymptotic performance bound 4.84 (the 
best bound known for this problem). Since CPP is a special case of 3BP, these
algorithms can be used to solve it. Our aim is to show that algorithms with
better asymptotic performance bounds can be designed.

* This work has been partially supported by Project ProNEx 107/97 (MCT/FINEP),
FAPESP (Proc. 96/4505-2), and CNPq individual research grants (Proc. 300301/98-7
and Proc. 304527/89-0).

Results of this kind have already been obtained for the 2-dimensional case, 
more precisely, for the Square Packing Problem (SPP). In this problem we are 
given a list of squares and we are asked to pack them into a minimum number
of square bins. In [6], Coppersmith and Raghavan observe that their technique
leads to an online algorithm for SPP with asymptotic performance bound 2.6875.
They also proved that any online algorithm for packing $d$-dimensional squares,
$d \ge 2$, must have asymptotic performance bound at least 4/3. Ferreira, Miyazawa
and Wakabayashi [9] presented an offline algorithm for SPP with asymptotic
performance bound 1.988. For the more general version of the 2-dimensional
case, where the items of $L$ are rectangles (instead of squares), Chung, Garey and
Johnson [2] designed an algorithm with asymptotic performance bound 2.125.

For more results on packing problems the reader is referred to [1, 3, 4, 5, 8].

The remainder of this paper is organized as follows. In Section 2 we present 
some notation and definitions. In Section 3 we describe an online algorithm 
for CPP that uses an approach introduced by Coppersmith and Raghavan [6], 
showing that its asymptotic performance bound is at most 3.954. In Section 4 
we present an offline algorithm with asymptotic performance bound 3.466. We 
mention a parametric version for these algorithms and derive asymptotic per- 
formance bounds. In Section 5 we present an improved version of the offline 
algorithm described in Section 4. We show that this algorithm has asymptotic 
performance bound 2.669. Finally, in Section 6 we present some concluding re- 
marks. 

2 Notation and Definitions 

The reader is referred to [14] for the basic concepts and terms related to packing. 
Without loss of generality, we assume that the bins have unit dimensions, since 
otherwise we can scale the cubes of the instance to fulfill this condition. 

A rectangular box $b$ with length $x$, width $y$ and height $z$ is denoted by a
triplet $b = (x, y, z)$. Thus, a cube is simply a triplet of the form $(x, x, x)$. The
size of a cube $c = (x, x, x)$, denoted by $s(c)$, is $x$. Here we assume that every
cube in the input list $L$ has size at most 1. The volume of a list $L$, denoted by
$V(L)$, is the sum of the volumes of the items in $L$.

For a given list $L$ and algorithm $A$, we denote by $A(L)$ the number of bins
used when algorithm $A$ is applied to list $L$, and by OPT($L$) the optimum number
of bins for a packing of $L$. We say that an algorithm $A$ has an asymptotic
performance bound $\alpha$ if there exists a constant $\beta$ such that

$A(L) \le \alpha \cdot \mathrm{OPT}(L) + \beta$, for all input lists $L$.

If $\beta = 0$ then we say that $\alpha$ is an absolute performance bound for algorithm $A$.

If $\mathcal{P}$ is a packing, then we denote by $\#(\mathcal{P})$ the number of bins used in $\mathcal{P}$.

An algorithm to pack a list of items $L = (c_1, \dots, c_n)$ is said to be online
if it packs the items in the order given by the list $L$, without knowledge of the
subsequent items on the list. An algorithm that is not online is said to be offline.

We consider here a parametric version of CPP, denoted by CPP$_m$, where $m$
is a natural number. In this problem, the instance $L$ consists of cubes with size
at most $1/m$. Thus CPP$_1$ and CPP are the same problem.

3 The Online Algorithm of Coppersmith and Raghavan 

In 1989, Coppersmith and Raghavan [6] introduced an online algorithm for the 
multidimensional bin packing problem. In this section we describe a specialized 
version of this algorithm for CPP. Our aim is to derive an asymptotic perfor- 
mance bound for this algorithm (not explicitly given in the above paper). 

The main idea of the algorithm is to round up the dimensions of the items
in $L$ using a rounding set $S = \{1 = s_0, s_1, \dots, s_i, \dots\}$, $s_i > s_{i+1}$. The first step
consists in rounding up each item size to the nearest value in $S$. The rounding
set $S$ for CPP is $S := S_1 \cup S_2 \cup S_3$, where

$S_1 = \{1\}$, $S_2 = \{1/2, 1/4, \dots, 1/2^k, \dots\}$, $S_3 = \{1/3, 1/6, \dots, 1/(3 \cdot 2^k), \dots\}$.

Let $\bar{x}$ be the value obtained by rounding up $x$ to the nearest value in $S$. Given
a cube $c = (x, x, x)$, define $\bar{c}$ as the cube $\bar{c} := (\bar{x}, \bar{x}, \bar{x})$. Let $\bar{L}$ be the list obtained
from $L$ by rounding up the sizes of the cubes to the values in the rounding set
$S$. The idea is to pack the cubes of the list $\bar{L}$ instead of $L$, so that the packing of
each cube $\bar{c} \in \bar{L}$ represents the packing of $c \in L$. The packing of $\bar{L}$ is generated
into bins belonging to three different groups: $G_1$, $G_2$ and $G_3$. Each group $G_i$
contains only bins of dimensions $(x, x, 1)$, $x \in S_i$, $i = 1, 2, 3$. A bin of dimension
$(x, x, 1)$ will have only cubes $\bar{c} = (x, x, x)$ packed into it. We say that a cube
$\bar{c} = (x, x, x)$ is of type $i$ if $x \in S_i$, $i = 1, 2, 3$.
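The rounding step can be sketched in a few lines. This is an illustrative sketch only: the function name `round_up` is an assumption, and exact rationals are used to avoid floating-point ties at the values of $S$.

```python
from fractions import Fraction


def round_up(x):
    """Smallest value of S = {1} U {1/2^k} U {1/(3*2^k)} that is >= x, for 0 < x <= 1."""
    x = Fraction(x)
    if not 0 < x <= 1:
        raise ValueError("cube sizes must lie in (0, 1]")
    best = Fraction(1)  # S_1 = {1} always works as an upper value
    for base in (Fraction(1, 2), Fraction(1, 3)):  # largest elements of S_2 and S_3
        v = base
        while v / 2 >= x:  # keep halving while the next value still covers x
            v /= 2
        if x <= v < best:
            best = v
    return best


for size in (Fraction(3, 5), Fraction(2, 5), Fraction(3, 10), Fraction(1, 5)):
    print(size, "->", round_up(size))
```

Consecutive values of $S$ below 1 differ by a factor of at least 2/3 (for example 1/2 and 1/3), so a cube of type 2 or 3 keeps at least $(2/3)^3 = 8/27$ of its rounded volume, while a cube rounded to 1 has size greater than 1/2 and keeps at least 1/8.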

To pack the next unpacked cube $\bar{c} \in \bar{L}$ with size $x \in S_i$, we proceed as
follows.

1. Let $B \in G_i$ be the first bin $B = (x, x, 1)$ such that the total height of the cubes
already packed in $B$ plus $x$ is at most 1 (if there exists such a bin $B$).

2. If there is a bin $B$ as in step 1, pack $\bar{c}$ in a Next Fit manner into $B$.

3. Otherwise,

a) take the first empty bin $C = (y, y, 1)$, $y \in S_i$, with $y \ge x$ and $y$ as small
as possible. If there is no such bin $C$, take a new bin $(1, 1, 1)$ and replace
it by $i^2$ bins of dimensions $(1/i, 1/i, 1)$ and let $C = (y, y, 1)$ be the first of
these bins.

b) If $y > x$, then replace $C$ by four other bins of dimensions $(y/2, y/2, 1)$.
Continue in this manner replacing one of these new bins by four bins,
until there is a bin $C$ of dimension $C = (y/2^m, y/2^m, 1)$ with $y/2^m = x$.

c) Pack $\bar{c}$ in a Next Fit manner into the first bin $C$.

4. Update the group $G_i$.

Let us now analyse the asymptotic performance of the algorithm we have
described. Consider $\mathcal{P}$ the packing of $\bar{L}$ generated by this algorithm, $\bar{L}_i$ the set
of all cubes of type $i$ in $\bar{L}$, and $\mathcal{P}_i$ the set of bins of $\mathcal{P}$ having only cubes of
type $i$. Now, let us consider the bins $B = (x, x, 1)$, $x \in S_i$, and compute the
volume occupied by the cubes of $\bar{L}$ that were packed into these bins. All bins in
the group $G_1$ are completely filled. Thus, $\#(\mathcal{P}_1) = V(\bar{L}_1)$. For the bins in the
groups $G_2$ and $G_3$ the unoccupied volume is at most 1 for each group. Therefore,
we have $\#(\mathcal{P}_i) \le V(\bar{L}_i) + 1$, $i = 2, 3$.

Now, let us consider the volume we increased because of the rounding process.
Writing $L_i$ for the cubes of $L$ whose rounded versions are in $\bar{L}_i$, each cube
$c \in L_1$ has volume at least $\frac{1}{8}$ of $\bar{c}$, and each cube $c \in L_2 \cup L_3$ has
volume at least $\frac{8}{27}$ of $\bar{c}$. Hence, we have the following inequalities:

$\#(\mathcal{P}_1) \le 8\,V(L_1)$ and $\#(\mathcal{P}_2 \cup \mathcal{P}_3) \le \frac{27}{8} V(L_2 \cup L_3) + 2$.

Let $n_1 := \#(\mathcal{P}_1)$ and $n_{23} := \#(\mathcal{P}_2 \cup \mathcal{P}_3) - 2$. Thus, using the inequalities
above and the fact that the volume of the cubes in $L$ is a lower bound for the
optimum packing, we have $\mathrm{OPT}(L) \ge V(L) \ge \frac{1}{8} n_1 + \frac{8}{27} n_{23}$.

Since $\mathrm{OPT}(L) \ge n_1$, it follows that $\mathrm{OPT}(L) \ge \max\{n_1, \frac{1}{8} n_1 + \frac{8}{27} n_{23}\}$. Now
using the fact that $\#(\mathcal{P}) = \#(\mathcal{P}_1) + \#(\mathcal{P}_2 \cup \mathcal{P}_3) = n_1 + n_{23} + 2$, we have

$\#(\mathcal{P}) \le \alpha \cdot \mathrm{OPT}(L) + 2$,

where $\alpha = (n_1 + n_{23}) / \max\{n_1, \frac{1}{8} n_1 + \frac{8}{27} n_{23}\}$. Analysing the two possible cases
for the denominator, we obtain $\alpha \le 3.954$.

The approach used above can also be used to develop online algorithms for
the parametric version CPP$_m$. In this case we partition the packing into two
parts. One part is an optimum packing with all bins, except perhaps the last
(say $n'$ bins), filled with $m^3$ cubes of volume at least $(1/(m+1))^3$ each. The
other part is a packing with all bins, except perhaps a fixed number of them
(say $n''$ bins), having an occupied volume of at least $((m+1)/(m+2))^3$.

It is not difficult to show that the asymptotic performance bound $\alpha_m$ of
CPP$_m$ is bounded by $(n' + n'') / \max\{n', (m/(m+1))^3 n' + ((m+1)/(m+2))^3 n''\}$.
For $m = 2$ and $m = 3$ these values are at most 2.668039 and 2.129151, respectively.
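The numerical bounds can be checked with a short computation. The sketch below rests on one observation that is not spelled out in the text: for constants $0 < g_1 < g_2 < 1$, the ratio $(n' + n'') / \max\{n', g_1 n' + g_2 n''\}$ is largest when the two terms of the maximum coincide, which gives the closed form $1 + (1 - g_1)/g_2$. The function name `online_bound` is an assumption made for the example.

```python
from fractions import Fraction


def online_bound(m):
    """Worst case of (n' + n'') / max{n', g1*n' + g2*n''} for CPP_m.

    g1 = (m/(m+1))^3 is the volume guarantee of the bins holding m^3 cubes,
    g2 = ((m+1)/(m+2))^3 is the guarantee of the remaining bins.
    Since g1 < g2, the maximum is attained where n' = g1*n' + g2*n''.
    """
    g1 = Fraction(m, m + 1) ** 3
    g2 = Fraction(m + 1, m + 2) ** 3
    return 1 + (1 - g1) / g2


for m in (1, 2, 3):
    bound = online_bound(m)
    print(m, bound, float(bound))
```

For $m = 1$ the guarantees are 1/8 and 8/27, and the closed form gives 253/64 = 3.953125, which agrees with the bound 3.954 stated for CPP.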

4 An Offline Algorithm 

Before we present our first offline algorithm for CPP, let us describe the algorithm
NFDH (Next Fit Decreasing Height), which is used as a subroutine.

NFDH first sorts the cubes of $L$ in nonincreasing order of their size, say
$c_1, c_2, \dots, c_n$. The first cube $c_1$ is packed in the position $(0,0,0)$, the next one
is packed in the position $(s(c_1),0,0)$ and so on, side by side, until a cube that
does not fit in this layer is found. At this moment the next cube $c_k$ is packed in
the position (0, s(ci), 0). The process continues in this way, layer by layer, until 
a cube that does not fit in the first level is found. Then the algorithm packs this 
cube in a new level at height s(ci). When a cube cannot be packed in a bin, it 
is packed in a new bin. The algorithm proceeds in this way until all cubes of L 
have been packed.

The following results will be used in the sequel. The proof of Lemma 1 is left
to the reader. Theorem 2 follows immediately from Lemma 1.

Theorem 1 (Meir and Moser [12]). Any list $L$ of $k$-dimensional cubes, with
sizes $x_1 \ge x_2 \ge \dots \ge x_n \ge \dots$, can be packed by algorithm NFDH into only one
$k$-dimensional rectangular parallelepiped of volume $a_1 \times a_2 \times \dots \times a_k$ if $a_j \ge x_1$
($j = 1, \dots, k$) and $x_1^k + (a_1 - x_1)(a_2 - x_1) \cdots (a_k - x_1) \ge V(L)$.

Lemma 1. For any list of cubes $L = (c_1, \dots, c_n)$ such that $s(c_i) \le 1/m$, the
following holds for the packing of $L$ into unit bins:

$\mathrm{NFDH}(L) \le ((m+1)/m)^3\, V(L) + 2$.

Theorem 2. For any list of cubes $L = (b_1, \dots, b_n)$ such that $s(b_i) \le 1/m$, the
following holds for the packing of $L$ into unit bins:

$\mathrm{NFDH}(L) \le ((m+1)/m)^3\, \mathrm{OPT}(L) + 2$.
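The NFDH procedure described above can be sketched as follows for cubes and unit bins. This is a minimal illustration rather than the authors' implementation: the names `nfdh_cubes` and `count_bins`, the tuple layout of a placement, and the small tolerance `EPS` for floating-point sums are all assumptions.

```python
EPS = 1e-12  # tolerance for accumulated floating-point error


def nfdh_cubes(sizes):
    """Place cubes into unit bins row by row, level by level, bin by bin.

    Returns a list of placements (bin_index, x, y, z, size), one per cube,
    in nonincreasing order of size.
    """
    placements = []
    bin_idx = 0
    x = y = z = 0.0
    row_depth = level_height = None  # set by the first cube of each row / level
    for s in sorted(sizes, reverse=True):
        if not 0 < s <= 1:
            raise ValueError("cube sizes must lie in (0, 1]")
        if row_depth is None:  # very first cube
            row_depth = level_height = s
        elif x + s > 1 + EPS:  # row is full: open the next row of this level
            x, y, row_depth = 0.0, y + row_depth, s
            if y + s > 1 + EPS:  # level is full: open a new level above it
                y, z, level_height = 0.0, z + level_height, s
                if z + s > 1 + EPS:  # bin is full: open a new bin
                    z = 0.0
                    bin_idx += 1
        placements.append((bin_idx, x, y, z, s))
        x += s
    return placements


def count_bins(placements):
    return placements[-1][0] + 1 if placements else 0


small = [0.3] * 100  # all sizes at most 1/3, so Lemma 1 applies with m = 3
used = count_bins(nfdh_cubes(small))
bound = (4 / 3) ** 3 * sum(s ** 3 for s in small) + 2
print(used, bound)
```

Because the cubes are sorted, every cube in a row is no larger than the first cube of that row, so the row depth and level height fixed by the first cube are never exceeded.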

Before presenting the first offline algorithm, called CUBE, let us introduce a 
definition and the main ideas behind it. 

If a packing $\mathcal{P}$ of a list $L$ satisfies the inequality $\#(\mathcal{P}) \le \frac{V(L)}{v} + C$, where $v$
and $C$ are constants, then we say that $v$ is a volume guarantee of the packing
$\mathcal{P}$ (for the list $L$). Algorithm CUBE uses an approach, which we call critical set
combination (see [13]), based on the following observation.

Recall that in the analysis of the performance of the algorithm presented
in Section 3 we considered the packing divided into two parts. One optimum
packing, of the list $L_1$, with a volume guarantee $\frac{1}{8}$, and the other part, of the
list $L_{23} = L_2 \cup L_3$, with a volume guarantee $\frac{8}{27}$. If we consider this partition of
$L$, the volume we can guarantee in each bin is the best possible, as we can have
cubes in $L_1$ with volume very close to $\frac{1}{8}$ and cubes in $L_{23}$ for which we have
a packing with volume occupation in each bin very close to $\frac{8}{27}$. In the critical
set combination approach, we first define some subsets of cubes in $L_1$ and $L_{23}$
with small volumes as the critical sets. Then we combine the cubes in these
critical sets obtaining a partial packing that is part of an optimum packing and
has volume occupation in each bin better than $\frac{1}{8}$. That is, sets of cubes that
would lead to small volume occupation are set aside and they are combined
appropriately so that the resulting packing has a better volume guarantee.

Algorithm CUBE

// To pack a list of cubes $L$ into unit bins $B = (1, 1, 1)$.

1 Let $p = 0.354014$; and $L_A$, $L_B$ be sublists of $L$ defined as follows.

$L_A \leftarrow \{c \in L_1 : \frac{1}{2} < s(c) \le 1 - p\}$, $L_B \leftarrow \{c \in L_2 : \frac{1}{3} < s(c) \le p\}$.

2 Generate a partial packing $\mathcal{P}_{AB}$ of $L_A \cup L_B$, such that $\mathcal{P}_{AB}$ is the union of packings
$\mathcal{P}_{AB}^1, \dots, \mathcal{P}_{AB}^k$, where $\mathcal{P}_{AB}^i$ is a packing generated for one bin, consisting of one
cube of $L_A$ and seven cubes of $L_B$, except perhaps the last (that can have fewer
cubes of $L_B$). [The packing $\mathcal{P}_{AB}$ will contain all cubes of $L_A$ or all cubes of $L_B$.]
Update the list $L$ by removing the cubes packed in $\mathcal{P}_{AB}$.

3 $\mathcal{P}' \leftarrow \mathrm{NFDH}(L)$;

4 Return $\mathcal{P}' \cup \mathcal{P}_{AB}$.
end algorithm.

Theorem 3. For any list $L$ of cubes for CPP, we have

$\mathrm{CUBE}(L) \le 3.466 \cdot \mathrm{OPT}(L) + 4$.

Proof. First, consider the packing $\mathcal{P}_{AB}$. Since each bin of $\mathcal{P}_{AB}$, except perhaps
the last, contains one cube of $L_A$ and seven cubes of $L_B$, we have

$\#(\mathcal{P}_{AB}) \le \frac{V(L_{AB})}{1/8 + 7/27} + 1 = \frac{216}{83} V(L_{AB}) + 1$, (1)

where $L_{AB}$ is the set of cubes of $L$ packed in $\mathcal{P}_{AB}$.



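Now consider a partition of $\mathcal{P}'$ into three partial packings $\mathcal{P}_1$, $\mathcal{P}_2$ and $\mathcal{P}_3$,
defined as follows. The packing $\mathcal{P}_1$ has the bins of $\mathcal{P}'$ with at least one cube of
size greater than $\frac{1}{2}$. The packing $\mathcal{P}_2$ has the bins of $\mathcal{P}' \setminus \mathcal{P}_1$ with at least one
cube of size greater than $\frac{1}{3}$. The packing $\mathcal{P}_3$ has the remaining bins, i.e., the
bins in $\mathcal{P}' \setminus (\mathcal{P}_1 \cup \mathcal{P}_2)$. Let $L_i$ be the set of cubes packed in $\mathcal{P}_i$, $i = 1, 2, 3$.

Since all cubes of $L_3$ have size at most 1/3, and they are packed in $\mathcal{P}_3$ with
algorithm NFDH, by Lemma 1, we have

$\#(\mathcal{P}_3) \le \frac{64}{27} V(L_3) + 2$. (2)

Case 1. $L_B$ is totally packed in $\mathcal{P}_{AB}$.

In this case, every cube of $L_1$ has volume at least $\frac{1}{8}$. Therefore

$\#(\mathcal{P}_1) \le 8\,V(L_1)$. (3)

Now, since every cube of $L_2$ has size at least $p$ and each bin of packing $\mathcal{P}_2$ has
at least 8 cubes of $L_2$, we have

$\#(\mathcal{P}_2) \le \frac{1}{8p^3} V(L_2) + 1$. (4)

Since $8p^3 = \min\{8p^3, \frac{27}{64}, \frac{83}{216}\}$, using (1), (2) and (4), and setting $\mathcal{P}_{aux} :=
\mathcal{P}_{AB} \cup \mathcal{P}_3 \cup \mathcal{P}_2$, we have

$\#(\mathcal{P}_{aux}) \le \frac{1}{8p^3} V(L_{aux}) + 4$. (5)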

Clearly, $\mathcal{P}_1$ is an optimum packing of $L_1$, and hence

$\#(\mathcal{P}_1) \le \mathrm{OPT}(L)$. (6)

Defining $n_1 := \#(\mathcal{P}_1)$ and $n_2 := \#(\mathcal{P}_{aux}) - 4$, and using inequalities (3), (5) and
(6) we have

$\mathrm{CUBE}(L) \le \alpha' \cdot \mathrm{OPT}(L) + 4$,

where $\alpha' = (n_1 + n_2) / \max\{n_1, \frac{1}{8} n_1 + 8p^3 n_2\} \le 3.466$.

Case 2. $L_A$ is totally packed in $\mathcal{P}_{AB}$.

In this case, the volume guarantee for the cubes in $L_1$ is better than the one
obtained in Case 1. Each cube of $L_1$ has size at least $1 - p$. Thus, $\#(\mathcal{P}_1) \le
\frac{1}{(1-p)^3} V(L_1)$. For the packing $\mathcal{P}_2$, we obtain a volume guarantee of at least $\frac{8}{27}$,
and the same holds for the packings $\mathcal{P}_3$ and $\mathcal{P}_{AB}$. Thus, for $\mathcal{P}_{aux}$ as above,
$\#(\mathcal{P}_{aux}) \le \frac{27}{8} V(L_{aux}) + 4$.

Since $\#(\mathcal{P}_1) \le \mathrm{OPT}(L)$, combining the previous inequalities and proceeding
as in Case 1, we have

$\mathrm{CUBE}(L) \le \alpha'' \cdot \mathrm{OPT}(L) + 4$,

where $\alpha'' = (n_1 + n_2) / \max\{n_1, (1-p)^3 n_1 + \frac{8}{27} n_2\} \le 3.466$.

The proof of the theorem follows from the results obtained in Case 1 and
Case 2. We observe that the value of $p$ was obtained by imposing equality for
the values of $\alpha'$ and $\alpha''$. □
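The balancing value of $p$ can be recovered numerically. The sketch below makes two assumptions that are reconstructions rather than statements of the text: the volume guarantees are $\frac{1}{8}$ and $8p^3$ in Case 1 and $(1-p)^3$ and $\frac{8}{27}$ in Case 2, and the worst case of each ratio occurs where the two terms of the maximum are equal, which gives $\alpha = 1 + (1 - g_1)/g_2$ for guarantees $g_1 < g_2$. The function names are hypothetical.

```python
def alpha_case1(p):
    # guarantees: 1/8 for the bins of P_1, 8*p^3 for the bins of P_aux
    return 1 + (1 - 1 / 8) / (8 * p ** 3)


def alpha_case2(p):
    # guarantees: (1 - p)^3 for the bins of P_1, 8/27 for the bins of P_aux
    return 1 + (1 - (1 - p) ** 3) / (8 / 27)


def balance(lo=1 / 3, hi=1 / 2, iterations=60):
    """Bisection for the p in (1/3, 1/2) where both case bounds coincide.

    alpha_case1 decreases in p and alpha_case2 increases in p, so the
    difference changes sign exactly once on the interval.
    """
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if alpha_case1(mid) > alpha_case2(mid):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


p = balance()
print(p, alpha_case1(p), alpha_case2(p))
```

Under these assumptions the root agrees with $p = 0.354014$ to the digits given, and both bounds stay below 3.466.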



Algorithm CUBE can also be generalized for the parametric problem CPP$_m$.
The idea is the same as the one used in algorithm CUBE. The input list is first
subdivided into two parts, $P_1$ and $P_2$. Part $P_1$ consists of those cubes with size in
$\left(\frac{1}{m+1}, \frac{1}{m}\right]$, and part $P_2$ consists of the remaining cubes. The critical cubes in each
part are defined using an appropriate value of $p = p(m)$, and then combined.
The analysis is also divided into two parts, according to which critical set is
totally packed in the combined packing. It is not difficult to derive the bounds
$\alpha(\mathrm{CUBE}_m)$ that can be obtained for the corresponding algorithms. For $m = 2$
and $m = 3$ the values of $\alpha(\mathrm{CUBE}_m)$ are at most 2.42362852 ($p = 0.26355815$)
and 1.98710756 ($p = 0.20916664$), respectively.



5 An Improved Algorithm for CPP 

We present in this section an algorithm for the cube packing problem that is 
an improvement of algorithm CUBE described in the previous section. For that, 
we consider another restricted version of CPP, denoted by CPP$^k$, where $k$ is an
integer greater than 2. In this problem the instance is a list $L$ consisting of cubes
of size greater than $\frac{1}{k}$. We use in the sequel the following result for CPP$^3$.

Lemma 2. There is a polynomial time algorithm to solve CPP$^3$.

Proof. Let $L_1 = \{c \in L : s(c) > \frac{1}{2}\}$ and $L_2 = L \setminus L_1$. Without loss of generality,
consider $L_1 = (c_1, \dots, c_k)$. Pack each cube $c_i \in L_1$ in a unit bin $C_i$ at the corner
$(0, 0, 0)$. Note that it is possible to pack seven cubes with size at most $1 - s(c_i)$
in each bin $C_i$. Now, for each bin $C_i$, consider seven other smaller bins $C_i^{(j)}$,
$j = 1, \dots, 7$, each with size $1 - s(c_i)$. Consider a bipartite graph $G$ with vertex
set $X \cup Y$, where $X$ is the set of the small bins, and $Y$ is precisely $L_2$. In $G$
there is an edge from a cube $c \in L_2$ to a bin $C_i^{(j)} \in X$ if and only if $c$ can be
packed into $C_i^{(j)}$. Clearly, a maximum matching in $G$ corresponds to a maximum
packing of the cubes of $L_2$ into the bins occupied by the cubes of $L_1$. Denote by
$\mathcal{P}_{12}$ the packing of $L_1$ combined with the cubes of $L_2$ packed with the matching
strategy. The optimum packing of $L$ can be obtained by adding to the packing
$\mathcal{P}_{12}$ the bins packed with the remaining cubes of $L_2$ (if existent), each with 8
cubes, except perhaps the last. □
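The matching step in the proof of Lemma 2 can be sketched with a standard augmenting-path routine. This is an illustration under assumptions, not the authors' code: each large cube of size $s$ contributes seven slots of size $1 - s$, a small cube fits a slot when its size is at most the slot size, and the name `match_small_cubes` and the list-based input format are hypothetical. The simple augmenting-path method used here (often called Kuhn's algorithm) is one of several ways to compute a maximum bipartite matching.

```python
def match_small_cubes(big, small):
    """Maximum matching between small cubes and the 7 free slots of each big cube.

    big:   sizes of the cubes larger than 1/2 (one per bin)
    small: sizes of the cubes in (1/3, 1/2]
    Returns (number of matched small cubes, {small index: big index}).
    """
    # Each big cube i contributes seven slots of side 1 - big[i].
    slots = [(i, 1 - s) for i, s in enumerate(big) for _ in range(7)]
    occupant = {}  # slot index -> index of the small cube placed there

    def try_assign(c, seen):
        for j, (_, capacity) in enumerate(slots):
            if small[c] <= capacity and j not in seen:
                seen.add(j)
                # Use a free slot, or move the current occupant elsewhere.
                if j not in occupant or try_assign(occupant[j], seen):
                    occupant[j] = c
                    return True
        return False

    matched = sum(try_assign(c, set()) for c in range(len(small)))
    return matched, {c: slots[j][0] for j, c in occupant.items()}


big = [0.6, 0.7]
small = [0.35] * 10 + [0.45] * 2
matched, assignment = match_small_cubes(big, small)
leftover = len(small) - matched
total_bins = len(big) + -(-leftover // 8)  # leftover cubes go 8 per bin
print(matched, total_bins)
```

In this instance only the bin holding the cube of size 0.6 has slots large enough for the cubes of size 0.35, so seven of them are matched and the remaining five small cubes share one extra bin.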

We say that a cube $c$ is of type G, resp. M, if $s(c) \in (\frac{1}{2}, 1]$, resp. $s(c) \in (\frac{1}{3}, \frac{1}{2}]$.

Lemma 3. It is possible to generate an optimum packing of an instance of CPP$^3$
such that each bin, except perhaps one, has one of the following configurations:

(a) C1: configuration consisting of 1 cube of type G and 7 cubes of type M;

(b) C2: configuration consisting of exactly 1 cube of type G; and

(c) C3: configuration consisting of 8 cubes of type M.

Lemma 2 shows the existence of a polynomial time optimum algorithm for
CPP$^3$. In fact, it is not difficult to design a greedy-like algorithm to solve CPP$^3$ in
time $O(n \log n)$. Such an algorithm is given in [9] for SPP$^3$ (defined analogously
with respect to SPP).

We are now ready to present the improved algorithm for the cube packing 
problem, which we call ICUBE (Improved CUBE). 



Algorithm ICUBE

// To pack a list of cubes $L$ into unit bins $B = (1, 1, 1)$.

1. Let $L'_1 \leftarrow \{q \in L : \frac{1}{3} < s(q) \le 1\}$.

2. Generate an optimum packing $\mathcal{P}'_1$ of $L'_1$ (in polynomial time), with bins as in Lemma
3. That is, solve CPP$^3$ with input list $L'_1$.

3. Let $\mathcal{P}_A$ be the set of bins $B \in \mathcal{P}'_1$ having configuration C2 with a cube $q \in B$ with
$s(q) \le \frac{2}{3}$; let $L_A$ be the set of cubes packed in $\mathcal{P}_A$.

4. Let $L_B \leftarrow \{q \in L : 0 < s(q) \le \frac{1}{3}\}$.

5. Generate a packing $\mathcal{P}_{AB}$ filling the bins in $\mathcal{P}_A$ with cubes of $L_B$ (see below).

6. Let $L_1$ be the set of all packed cubes, and $\mathcal{P}_1$ the packing generated for $L_1$.

7. Let $\mathcal{P}_2$ be the packing of the unpacked cubes of $L_B$ generated by NFDH.

8. Return the packing $\mathcal{P}_1 \cup \mathcal{P}_2$.
end algorithm



To generate the packing $\mathcal{P}_{AB}$, in step 5 of algorithm ICUBE, we first partition
the list $L_B$ into 5 lists, $L_{B,3}$, $L_{B,4}$, $L_{B,5}$, $L_{B,6}$, $L_{B,7}$, defined as follows. $L_{B,i} =
\{c \in L_B : \frac{1}{i+1} < s(c) \le \frac{1}{i}\}$, $i = 3, \dots, 6$, and $L_{B,7} = \{c \in L_B : s(c) \le \frac{1}{7}\}$. Then
we combine the cubes in each of these lists with the packing $\mathcal{P}_A$ generated in
step 3.

Now consider the packing of cubes of $L_{B,3}$ into bins of $\mathcal{P}_A$. Since we can
pack 19 cubes of $L_{B,3}$ into each of these bins, we generate such a packing until
all cubes of $L_{B,3}$ have been packed, or until there are no more bins in $\mathcal{P}_A$. We
generate similar packings combining the remaining bins of $\mathcal{P}_A$ with the lists $L_{B,4}$,
$L_{B,5}$ and $L_{B,6}$. To pack the cubes of $L_{B,7}$ into bins of $\mathcal{P}_A$ we consider the empty
space of the bin divided into three smaller bins of dimensions $(1, 1, \frac{1}{3})$, $(1, \frac{1}{3}, \frac{2}{3})$
and $(\frac{1}{3}, \frac{2}{3}, \frac{2}{3})$. Then use NFDH to pack the cubes in $L_{B,7}$ into these smaller bins.
We continue the packing of $L_{B,7}$ using other bins of $\mathcal{P}_A$ until there are no more
unpacked cubes of $L_{B,7}$, or all bins of $\mathcal{P}_A$ have been considered.

Theorem 4. For any instance L o/CPP, we have 

ICUBE(L) < 2.669 • OPT(L) + 7. 

Proof. (Sketch) Let $C'_1$, $C'_2$ and $C'_3$ be the sets of bins used in $\mathcal{P}'_1$ with configurations
C1, C2 and C3, respectively. Considering the volume guarantees of $C'_1$, $C'_2$ and
$C'_3$, we have $\#(C'_1) \le \frac{V(C'_1)}{1/8 + 7/27} + 1$, $\#(C'_2) \le \frac{V(C'_2)}{1/8} + 1$ and $\#(C'_3) \le
\frac{V(C'_3)}{8/27} + 1$.

We call $L_A$ the set of cubes packed in $C'_2$, and consider it a critical set ($L_A :=
\{q \in L : \frac{1}{2} < s(q) \le \frac{2}{3}\}$). The bins of $C'_2$ are additionally filled with the
cubes of the list $L_B := \{q \in L : 0 < s(q) \le \frac{1}{3}\}$, until possibly all cubes of $L_B$ have
been packed. We have two cases to analyse.

Case 1: All cubes of $L_B$ have been packed in $\mathcal{P}_{AB}$.
The analysis of this case is simple and will be omitted.

Case 2: There are cubes of $L_B$ not packed in $\mathcal{P}_{AB}$.

Note that the volume occupation in each bin with configuration C1 or C3 is at
least $\frac{8}{27}$. For the bins with configuration C2, we have a volume occupation of $\frac{1}{8}$.
In step 5, the bins with configuration C2 are additionally filled with cubes of $L_B$
generating a combined packing $\mathcal{P}_{AB}$.

In this case, all cubes of $L_A$ have been packed with cubes of $L_B$. Thus, each
bin of $\mathcal{P}_{AB}$ has a volume occupation of at least $\frac{8}{27}$. The reader can verify this fact
by adding up the volume of these cubes in $L_A$ and the cubes of $L_{B,i}$, $i = 3, \dots, 6$.
For bins combining cubes of $L_A$ with $L_{B,7}$, we use Theorem 1 to guarantee this
minimum volume occupation for the resulting packed bins. Therefore, we have
an optimum packing of $L_1$ with volume guarantee at least $\frac{8}{27}$. Thus we have

$\#(\mathcal{P}_1) \le \mathrm{OPT}(L)$, and $\#(\mathcal{P}_1) \le \frac{27}{8} V(L_1) + 6$.

The packing $\mathcal{P}_2$ is generated by algorithm NFDH for a list of cubes with size
not greater than $\frac{1}{3}$. Therefore, by Lemma 1, we have $\#(\mathcal{P}_2) \le \frac{64}{27} V(L_2) + 2$.
Now, proceeding as in the proof of Theorem 3, we obtain

$\mathrm{ICUBE}(L) \le \alpha \cdot \mathrm{OPT}(L) + 8$,

where $\alpha \le 2.669$. □

6 Concluding Remarks 

We have described an online algorithm for CPP that is a specialization of an
approach introduced by Coppersmith and Raghavan [6] for a more general setting.
Our motivation in doing so was to obtain the asymptotic performance bound
(3.954) of this algorithm, so that we could compare it with the bounds of the
offline algorithms presented here.

We have shown a simple offline algorithm for CPP with asymptotic performance
bound 3.466. Then we have designed another offline algorithm that is an
improvement of this algorithm, with asymptotic performance bound 2.669. This
result can be generalized to $k$-dimensional cube packing, for $k > 3$, by making
use of Theorem 1 and generalizing the techniques used in this paper. Both
algorithms can be implemented to run in time $O(n \log n)$, where $n$ is the number
of cubes in the list $L$. We have also shown that if the instance consists of cubes
with size greater than 1/3 there is a polynomial exact algorithm.

We did not find in the literature offline algorithms for CPP with asymptotic
performance bound as good as 2.669.

Abstract. The Flexible Job Shop Problem is a generalization of the
classical job shop scheduling problem in which for every operation there 
is a group of machines that can process it. The problem is to assign 
operations to machines and to order the operations on the machines, so 
that the operations can be processed in the smallest amount of time. We 
present a linear time approximation scheme for the non-preemptive version
of the problem when the number $m$ of machines and the maximum
number $\mu$ of operations per job are fixed. We also study the preemptive
version of the problem when $m$ and $\mu$ are fixed, and present a linear time
$(2 + \varepsilon)$-approximation algorithm for the problem with migration.

1 Introduction 

The job shop scheduling problem is a classical problem in Operations Research 
[10] in which it is desired to process a set $\mathcal{J} = \{J_1, \dots, J_n\}$ of $n$ jobs on a group
$M = \{1, \dots, m\}$ of $m$ machines in the smallest amount of time. Every job $J_j$
consists of a sequence of $\mu$ operations $O_{1j}, O_{2j}, \dots, O_{\mu j}$ which must be processed
in the given order. Every operation $O_{ij}$ has assigned a unique machine $m_{ij} \in M$
which must process the operation without interruption during $p_{ij}$ units of time,
and a machine can process at most one operation at a time. 

In this paper we study a generalization of the job shop scheduling problem
called the flexible job shop problem [1], which models a wide variety of problems
encountered in real manufacturing systems [1,13]. In the flexible job shop
problem an operation $O_{ij}$ can be processed by any machine from a given group
$M_{ij} \subseteq M$. The processing time of operation $O_{ij}$ on machine $k \in M_{ij}$ is $p^k_{ij}$. The
goal is to choose for each operation $O_{ij}$ an eligible machine and a starting time
so that the maximum completion time $C_{\max}$ over all jobs is minimized. $C_{\max}$ is
called the makespan or the length of the schedule.
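As an illustration of the problem definition, the following minimal sketch checks a candidate non-preemptive schedule and returns its makespan. The data layout is an assumption made only for this example: `jobs[j][i]` is a dictionary mapping each eligible machine of the $i$-th operation of job $j$ to its processing time, and `schedule[(j, i)]` is a hypothetical pair (machine, start time).

```python
from typing import Dict, List, Tuple

# Assumed layout: jobs[j][i] maps eligible machine -> processing time.
Jobs = List[List[Dict[int, float]]]
# Assumed layout: schedule[(j, i)] = (chosen machine, start time).
Schedule = Dict[Tuple[int, int], Tuple[int, float]]


def makespan(jobs: Jobs, schedule: Schedule) -> float:
    """Validate a non-preemptive schedule and return its makespan C_max."""
    busy: Dict[int, List[Tuple[float, float]]] = {}
    c_max = 0.0
    for j, ops in enumerate(jobs):
        prev_end = 0.0
        for i, op in enumerate(ops):
            machine, start = schedule[(j, i)]
            if machine not in op:
                raise ValueError(f"machine {machine} not eligible for O_{i}{j}")
            if start < prev_end:
                raise ValueError(f"operation order violated in job {j}")
            end = start + op[machine]
            busy.setdefault(machine, []).append((start, end))
            prev_end = end
        c_max = max(c_max, prev_end)
    # A machine can process at most one operation at a time.
    for machine, intervals in busy.items():
        intervals.sort()
        for (_, e1), (s2, _) in zip(intervals, intervals[1:]):
            if s2 < e1:
                raise ValueError(f"overlap on machine {machine}")
    return c_max
```

The check mirrors the three requirements stated above: an eligible machine per operation, the given order within each job, and at most one operation per machine at any time.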
The flexible job shop problem is more complex than the job shop problem
because of the need to assign operations to machines. Following the three-field
$\alpha|\beta|\gamma$ notation suggested by Vaessens [13] and based on that of [4], we denote
our problem as $m\,1\,m \mid chain, op \le \mu \mid C_{\max}$. In the first field, $m$ specifies that the
number of machines is a constant, 1 specifies that any operation requires at most
one machine to be processed, and the second $m$ gives an upper bound on the
number of machines that can process an operation. The second field states the
precedence constraints and the maximum number of operations per job, while
the third field specifies the objective function. The following special cases of the
problem are already NP-hard (see [13] for a survey): $2\,1\,2 \mid chain, n = 3 \mid C_{\max}$,
$3\,1\,2 \mid chain, n = 2 \mid C_{\max}$, and $2\,1\,2 \mid chain, op \le 2 \mid C_{\max}$.

The job shop scheduling problem has been extensively studied. The problem
is known to be strongly NP-hard even if each job has at most three operations
and there are only two machines [10]. Williamson et al. [14] proved that when
the number of machines, jobs, and operations per job are part of the input
there does not exist a polynomial time approximation algorithm with worst
case bound smaller than 5/4 unless P = NP. On the other hand the preemptive
version of the job shop scheduling problem is NP-complete in the strong sense
even when $m = 3$ and $\mu = 3$ [3]. Jansen et al. [8] have designed a linear time
approximation scheme for the case when $m$ and $\mu$ are fixed. When $m$ and $\mu$ are
part of the input the best known result [2] is an approximation algorithm with
worst case bound $O([\log(m\mu)\log(\min\{m\mu, p_{\max}\})/\log\log(m\mu)]^2)$, where $p_{\max}$
is the largest processing time among all operations.

Scheduling jobs with chain precedence constraints on unrelated parallel
machines is equivalent to the flexible job shop problem. For the
first problem, Shmoys et al. [12] have designed a polynomial-time randomized
algorithm that, with high probability, finds a schedule of length at most
$O((\log^2 n/\log\log n)\,C^*_{\max})$, where $C^*_{\max}$ is the optimal makespan.

In this work we study the preemptive and non-preemptive versions of the
flexible job shop scheduling problem when the number of machines $m$ and the
number of operations per job $\mu$ are fixed. We generalize the techniques described
in [8] for the job shop scheduling problem and design a linear time approximation
scheme for the flexible job shop problem. In addition, each job $J_j$ has a delivery
time $q_j$. If in a schedule $J_j$ completes its processing at time $C_j$, then its delivery
completion time is equal to $C_j + q_j$. The problem now is to find a schedule
that minimizes the maximum delivery completion time $L_{\max}$. We notice that by
using the same techniques we can also handle the case in which each job $J_j$ has
a release time $r_j$ when it becomes available for processing and the objective is
to minimize the makespan.

Our techniques allow us also to design a linear time approximation scheme 
for the preemptive version of the flexible job shop problem without migration. 
No migration means that each operation must be processed by a unique machine. 
So if an operation is preempted, its processing can only be resumed on the same 
machine on which it was being processed before the preemption. Due to space 
limitations we do not describe this algorithm here. We also study the preemptive 
flexible job shop problem with migration, and present a $(2 + \varepsilon)$-approximation
algorithm for it. The last algorithm handles release and delivery times, and both
of them produce solutions with only a constant number of preemptions.



2 The Non-preemptive Flexible Job Shop Problem 



Consider an instance of the flexible job shop problem with release and delivery
times. Let $L^*_{\max}$ be the length of an optimum schedule. For every job $J_j$, let
$P_j = \sum_{i=1}^{\mu} [\min_{s \in M_{ij}} p^s_{ij}]$ denote its minimum processing time. Let $P = \sum_{J_j \in \mathcal{J}} P_j$.
Let $r_j$ be the release time of job $J_j$ and $q_j$ be its delivery time. We define
$t_j = r_j + P_j + q_j$ for all jobs $J_j$, and we let $t_{\max} = \max_j t_j$.



Lemma 1.

$$\max\left\{\frac{P}{m},\; t_{\max}\right\} \le L^*_{\max} \le P + t_{\max}. \qquad (1)$$



We divide all processing, release, and delivery times by $\max\{P/m, t_{\max}\}$, and
thus by Lemma 1,

$$1 \le L^*_{\max} \le m + 1, \quad \text{and} \quad t_{\max} \le 1. \qquad (2)$$
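The bounds of Lemma 1 and the normalization step can be sketched as follows. This is only an illustration under assumed data structures: each job is a hypothetical triple (operations, release time, delivery time), and each operation maps its eligible machines to processing times. The scaling factor $\max\{P/m, t_{\max}\}$ is the one for which the normalized bounds (2) hold.

```python
from typing import Dict, List, Tuple

Operation = Dict[int, float]               # eligible machine -> processing time
Job = Tuple[List[Operation], float, float]  # (operations, r_j, q_j), assumed layout


def min_processing_time(ops: List[Operation]) -> float:
    """P_j: sum over operations of the fastest eligible machine's time."""
    return sum(min(op.values()) for op in ops)


def lemma1_bounds(jobs: List[Job], m: int) -> Tuple[float, float]:
    """Return (max{P/m, t_max}, P + t_max), the two sides of Lemma 1."""
    P = sum(min_processing_time(ops) for ops, _, _ in jobs)
    t_max = max(r + min_processing_time(ops) + q for ops, r, q in jobs)
    return max(P / m, t_max), P + t_max


def normalize(jobs: List[Job], m: int) -> List[Job]:
    """Divide all processing, release, and delivery times by the lower bound."""
    scale, _ = lemma1_bounds(jobs, m)
    return [
        ([{k: p / scale for k, p in op.items()} for op in ops], r / scale, q / scale)
        for ops, r, q in jobs
    ]
```

After normalization the lower bound equals 1 and the upper bound is at most $m + 1$, up to floating-point rounding.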

We observe that Lemma 1 holds also for the preemptive version of the 
problem with or without migration. Here we present an algorithm for the non- 
preemptive flexible job shop problem that works for the case when all release 
times are zero. The algorithm works as follows. First we show how to trans- 
form an instance of the flexible job shop problem into another instance without 
delivery times. Then we define a set of time intervals and assign operations to 
the intervals so that operations from the same job that are assigned to different 
intervals appear in the correct order, and the total length of the intervals is no 
larger than the length of an optimum schedule. We perform this step by first 
fixing the position of the operations of a constant number of jobs (which we call 
the long jobs), and then using linear programming to determine the position of 
the remaining operations. 

Next we use an algorithm by Sevastianov [11] to find a feasible schedule
for the operations within each interval. Sevastianov’s algorithm finds for each
interval a schedule of length equal to the length of the interval plus $\mu^3 m\,p_{\max}$,
where $p_{\max}$ is the largest processing time of any operation in the interval. In order
to keep this enlargement small, we remove from each interval a subset $V$ of jobs
with large operations before running Sevastianov’s algorithm. Those operations
are scheduled at the beginning of the solution, and by carefully choosing the set
of long jobs we can show that the total length of the operations in $V$ is very
small compared to the overall length of the schedule.



2.1 Getting Rid of the Delivery Times 

We use a technique by Hall and Shmoys [6] to transform an instance of the
flexible job shop problem into another with only a constant number of different
delivery times. Let $q_{\max}$ be the maximum delivery time and let $\varepsilon > 0$ be a
constant value. The idea is to round each delivery time down to the nearest
multiple of $\frac{\varepsilon}{2} q_{\max}$ to get at most $1 + 2/\varepsilon$ distinct delivery times. Next, apply
a $(1 + \varepsilon/2)$-approximation algorithm for the flexible job shop problem that can
handle $1 + 2/\varepsilon$ distinct delivery times (this algorithm is described below). Finally,
add $\frac{\varepsilon}{2} q_{\max}$ to the completion time of each job; this increases the length of the
solution by $\frac{\varepsilon}{2} q_{\max}$. The resulting schedule is feasible for the original instance,
so this is a $(1 + \varepsilon)$-approximation algorithm for the original problem. In the
remainder of this section, we restrict our attention to the problem for which the
delivery times $q_1, \ldots, q_n$ can take only $\chi \le 1 + \frac{2}{\varepsilon}$ distinct values, which we denote
by $\delta_1 > \ldots > \delta_\chi$.
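The rounding step can be sketched in a few lines. The function name and return convention are assumptions made for this illustration; the step size is the multiple $\frac{\varepsilon}{2} q_{\max}$ described above.

```python
import math
from typing import List, Tuple


def round_delivery_times(q: List[float], eps: float) -> Tuple[List[float], float]:
    """Round each delivery time down to a multiple of (eps/2) * q_max.

    Returns the rounded delivery times and the step size. Each value loses
    less than one step, which is the amount added back to every completion
    time at the end of the procedure.
    """
    q_max = max(q)
    if q_max == 0:
        return list(q), 0.0
    step = eps * q_max / 2.0
    rounded = [math.floor(qj / step) * step for qj in q]
    return rounded, step
```

For example, with delivery times 10, 7, 3, 0 and $\varepsilon = 0.5$ the step is 2.5, and the rounded values are multiples of 2.5 no larger than the originals, so at most $1 + 2/\varepsilon = 5$ distinct values can occur.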

The delivery time of a job can be interpreted as an additional delivery operation
that must be processed on a non-bottleneck machine after the last operation
of the job. A non-bottleneck machine is a machine that can process simultaneously
any number of operations. Moreover, every feasible schedule for the jobs
can be transformed into another feasible schedule, in which all delivery operations
finish at the same time, without increasing the length of the schedule: simply
shift the delivery operations to the end of the schedule. Therefore, we only need
to consider a set $\mathcal{D} = \{d_1, \ldots, d_\chi\}$ of $\chi$ different delivery operations, where $d_i$ has
processing time $\delta_i$.



2.2 Relative Schedules 

Assume that the jobs are indexed so that $P_1 \ge P_2 \ge \cdots \ge P_n$. Let $\mathcal{L} \subset \mathcal{J}$ be the
set formed by the first $k$ jobs, i.e., the $k$ jobs with longest minimum processing
time, where $k$ is a constant to be defined later. We call $\mathcal{L}$ the set of long jobs. An
operation from a long job is called a long operation, regardless of its processing
time. Let $\mathcal{S} = \mathcal{J} \setminus \mathcal{L}$ be the set of short jobs.

Consider any feasible schedule for the jobs in $\mathcal{J}$. This schedule assigns a machine
to every operation and it also defines a relative ordering for the starting
and finishing times of the operations. A relative schedule $R$ for $\mathcal{L}$ is an assignment
of machines to long operations and a relative ordering for the starting and
finishing times of the long operations and the delivery operations, such that there
is a feasible schedule for $\mathcal{J}$ that respects $R$. This means that for every relative
schedule $R$ there is a feasible schedule for $\mathcal{J}$ that assigns the same machines as
$R$ to the long operations and that schedules the long and the delivery operations
in the same relative order as $R$. Since there is a constant number of long jobs,
there is also a constant number of different relative schedules.

Lemma 2. The number of relative schedules is at most $m^{\mu k}(2(\mu k + \chi))!$.

If we build all relative schedules, one of them must be equal to the relative
schedule defined by some optimum solution. Since it is possible to build all
relative schedules in constant time, we may assume without loss of generality
that we know how to find a relative schedule $R$ such that some optimum schedule
for $\mathcal{J}$ respects $R$.
Fix a relative schedule $R$ as described above. The ordering of the starting
and finishing times of the operations divides the time line into intervals
that we call snapshots. We can view a relative schedule as a sequence of snapshots
$M(1), M(2), \ldots, M(g)$, where $M(1)$ is the unbounded snapshot whose right
boundary is the starting time of the first operation according to $R$, and $M(g)$
is the snapshot bounded on the right by the finishing time of the delivery operations.
The number of snapshots $g$ is at most $2\mu k + \chi + 1$ because the starting
and finishing times of every operation might bound a snapshot.

2.3 Scheduling the Small Jobs 

Given a relative schedule $R$ as described above, to obtain a solution for the
flexible job shop problem we need to schedule the small operations within the
snapshots defined by $R$. We do this in two steps. First we use a linear program
$LP(R)$ to assign small operations to snapshots and machines, and second, we
find a feasible schedule for the small operations within every snapshot.

To formulate the linear program we need first to define some variables. For
each snapshot $M(\ell)$ we use a variable $t_\ell$ to denote its length. For each $J_j \in \mathcal{S}$ we
define a set of decision variables $x_{j,(i_1,\ldots,i_\mu),(s_1,\ldots,s_\mu)}$ with the following meaning:

$x_{j,(i_1,\ldots,i_\mu),(s_1,\ldots,s_\mu)} = f$ iff for all $q = 1, \ldots, \mu$, an $f$ fraction of the $q$-th operation
of job $J_j$ is completely scheduled in the $i_q$-th snapshot and on machine $s_q$.

Let $\alpha_j$ be the snapshot where the delivery operation of job $J_j$ starts. For
every variable $x_{j,(i_1,\ldots,i_\mu),(s_1,\ldots,s_\mu)}$ we need $1 \le i_1 \le i_2 \le \cdots \le i_\mu < \alpha_j$ to ensure
that the operations of $J_j$ are scheduled in the proper order. Let

$$A_j = \{(i,s) \mid i = (i_1, \ldots, i_\mu),\ 1 \le i_1 \le \cdots \le i_\mu < \alpha_j,\ s = (s_1, \ldots, s_\mu),\ s_q \in M_{qj} \text{ and no long operation is scheduled by } R \text{ at snapshot } i_q \text{ on machine } s_q, \text{ for all } q = 1, \ldots, \mu\}.$$

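The load $L_{\ell,h}$ on machine $h$ in snapshot $M(\ell)$ is the total processing time of
the operations from small jobs assigned to $h$ during $M(\ell)$, i.e.,

$$L_{\ell,h} = \sum_{J_j \in \mathcal{S}} \; \sum_{(i,s) \in A_j} \; \sum_{\substack{q = 1, \ldots, \mu \\ i_q = \ell,\ s_q = h}} x_{jis}\, p^{s_q}_{qj}, \qquad (3)$$

where $i_q$ and $s_q$ are the $q$-th components of tuples $i$ and $s$ respectively.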

For every long operation $O_{ij}$ let $\alpha_{ij}$ and $\beta_{ij}$ be the indices of the first and last
snapshots where the operation is scheduled. Let $p_{ij}$ be the processing time of
long operation $O_{ij}$ according to the machine assignment defined by the relative
schedule $R$. We are ready to describe the linear program $LP(R)$ that assigns
small operations to snapshots.



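$$\begin{array}{lll}
\text{Minimize} & \sum_{\ell=1}^{g} t_\ell & \\
\text{s.t.} \quad (1) & \sum_{\ell=\alpha_{ij}}^{\beta_{ij}} t_\ell = p_{ij}, & \text{for all } J_j \in \mathcal{L},\ i = 1, \ldots, \mu, \\
\phantom{\text{s.t.} \quad} (2) & \sum_{\ell=\alpha_j}^{g} t_\ell = \delta_j, & \text{for all delivery operations } d_j, \\
\phantom{\text{s.t.} \quad} (3) & \sum_{(i,s) \in A_j} x_{jis} = 1, & \text{for all } J_j \in \mathcal{S}, \\
\phantom{\text{s.t.} \quad} (4) & L_{\ell,h} \le t_\ell, & \text{for all } \ell = 1, \ldots, g,\ h = 1, \ldots, m, \\
\phantom{\text{s.t.} \quad} (5) & t_\ell \ge 0, & \text{for all } \ell = 1, \ldots, g, \\
\phantom{\text{s.t.} \quad} (6) & x_{jis} \ge 0, & \text{for all } J_j \in \mathcal{S},\ (i,s) \in A_j.
\end{array}$$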

Lemma 3. An optimum solution of $LP(R)$ has value no larger than the length
of an optimum schedule $S^*$ that respects the relative schedule $R$.

One can solve $LP(R)$ optimally in polynomial time and get only a constant
number of jobs with fractional assignments, since a basic feasible solution of
$LP(R)$ has at most $k\mu + n - k + mg + \chi$ variables with positive value. By
constraint (3) every small job has at least one positive variable associated with
it, and so there are at most $mg + k\mu + \chi$ jobs with fractional assignments. We
show later how to get rid of any constant number of fractional assignments by
only slightly increasing the length of the solution.

The drawback of this approach is that solving the linear program might take
a very long time. Since we want only an approximate solution to the flexible
job shop problem, it is not necessary to find an optimum solution for $LP(R)$; an
approximate solution suffices.

2.4 Approximate Solution of the Linear Program 

A convex block-angular resource sharing problem has the form:

$$\min\left\{\lambda \;\middle|\; \sum_{k=1}^{K} f^k_i(x^k) \le \lambda, \text{ for all } i = 1, \ldots, N, \text{ and } x^k \in B^k,\ k = 1, \ldots, K\right\},$$

where $f^k_i : B^k \to \mathbb{R}^+$ are $N$ non-negative continuous convex functions, and
$B^k$ are disjoint convex compact nonempty sets called blocks, $1 \le k \le K$. The
Potential Price Directive Decomposition Method of Grigoriadis and Khachiyan
[5] can find a $(1 + \rho)$-approximate solution to this problem for any $\rho > 0$. This
algorithm needs $O(N(\rho^{-2} \ln \rho^{-1} + \ln N)(N \ln\ln(N/\rho) + KF))$ time, where $F$ is
the time needed to find a $\rho$-approximate solution to the following problem on any
block $B^k$, for some vector $(p_1, \ldots, p_N) \in \mathbb{R}^N$: $\min\{\sum_{i=1}^{N} p_i f^k_i(x^k) \mid x^k \in B^k\}$.

We can write $LP(R)$ as a convex block-angular resource sharing problem as
follows. First we guess the value $s$ of an optimum solution for $LP(R)$, and add
the constraint $\sum_{\ell=1}^{g} t_\ell \le s$ to the linear program. Note that $s \le m + 1$. Then
we replace constraint (4) by constraint (4'), where $\lambda$ is a non-negative value:

(4') $L_{\ell,h} - t_\ell + m + 1 \le \lambda$, for all $\ell = 1, \ldots, g$, $h = 1, \ldots, m$.

This new linear program, which we denote by $LP(R, s, \lambda)$, has the above block-angular
structure. The blocks $B_j = \{x_{jis} \mid \text{constraints (3) and (6) hold}\}$ are
$(mg)^\mu$-dimensional simplices. The block $B_0 = \{t_\ell \mid \sum_{\ell=1}^{g} t_\ell \le s \text{ and constraints (1), (2), and (5) hold}\}$
also has constant dimension. Let $f_{\ell,h} = L_{\ell,h} - t_\ell + m + 1$.
Since $t_\ell \le s \le m + 1$, these functions are non-negative. Each block $B_i$ has
constant dimension, and so the above block optimization problem can be solved
in constant time. Therefore the algorithm of [5] finds a $(1 + \rho)$-approximate solution
for $LP(R, s, \lambda)$ in $O(n)$ time for any value $\rho > 0$. This gives a feasible
solution of $LP(R, s, m + 1 + \rho')$ for $\rho = \rho'/(m + 1)$.

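Let $L^*_{\max}$ be the length of an optimum schedule and assume that $R$ is a
relative schedule for $\mathcal{L}$ in an optimum schedule. We can use binary search on the
interval $[1, 1 + m]$ to find a value $s \le (1 + \frac{\varepsilon}{8}) L^*_{\max}$ such that $LP(R, s, m + 1 + \rho')$
has a solution for $\rho' = \frac{\varepsilon}{8g}$. This search can be performed in $O(\log(\frac{1}{\varepsilon} \log m))$
iterations by performing the binary search only on the following values,

$$(1 + \tfrac{\varepsilon}{8}),\ (1 + \tfrac{\varepsilon}{8})^2,\ \ldots,\ (1 + \tfrac{\varepsilon}{8})^{b-1},\ m + 1 \qquad (4)$$

where $b$ is the smallest integer such that $(1 + \varepsilon/8)^b \ge m + 1$. Thus,
$b \le \ln(m + 1)/\ln(1 + \varepsilon/8) + 1 = O(\frac{1}{\varepsilon} \log m)$, since $\ln(1 + \varepsilon/8) \ge \varepsilon/16$ for $\varepsilon \le 8$. To see that this
search yields the desired value for $s$, note that there exists a nonnegative integer
$i < b$ such that $L^*_{\max} \in [(1 + \frac{\varepsilon}{8})^i, (1 + \frac{\varepsilon}{8})^{i+1}]$ and therefore with the above search
we find a value $s \le (1 + \frac{\varepsilon}{8}) L^*_{\max}$ for which $LP(R, s, m + 1 + \rho')$ has a feasible
solution. Linear program $LP(R, s, m + 1 + \rho')$ assumes that the length of each
snapshot is increased by $\rho'$, and therefore the total length of the solution is
$(1 + \frac{\varepsilon}{8}) L^*_{\max} + g\rho' \le (1 + \frac{\varepsilon}{4}) L^*_{\max}$.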
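The search over the geometric grid of values can be sketched as follows. The callable `feasible` is a hypothetical oracle standing in for the approximate feasibility test of the linear program; the sketch assumes that feasibility is monotone in $s$ and that the largest grid value $m + 1$ is feasible.

```python
from typing import Callable, List


def candidate_values(m: int, eps: float) -> List[float]:
    """Grid (1 + eps/8), (1 + eps/8)^2, ..., (1 + eps/8)^(b-1), m + 1,
    where b is the smallest integer with (1 + eps/8)^b >= m + 1."""
    base = 1.0 + eps / 8.0
    b = 1
    while base ** b < m + 1:
        b += 1
    return [base ** i for i in range(1, b)] + [float(m + 1)]


def smallest_feasible(values: List[float],
                      feasible: Callable[[float], bool]) -> float:
    """Binary search for the smallest grid value accepted by the oracle.

    Assumes `values` is increasing, `feasible` is monotone, and the last
    value is feasible; uses O(log len(values)) oracle calls.
    """
    lo, hi = 0, len(values) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if feasible(values[mid]):
            hi = mid
        else:
            lo = mid + 1
    return values[lo]
```

Because consecutive grid values differ by a factor of $1 + \varepsilon/8$, the value returned exceeds the true threshold by at most that factor, while the number of oracle calls is logarithmic in the grid size $b$.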

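Lemma 4. A solution for $LP(R, s, m + 1 + \rho')$, with $s \le (1 + \frac{\varepsilon}{8}) L^*_{\max}$ and
$\rho' = \frac{\varepsilon}{8g}$, of value at most $(1 + \frac{\varepsilon}{4}) L^*_{\max}$ can be found in linear time.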

By using a technique similar to that of [8] we can modify any feasible solution for
$LP(R, s, m + 1 + \rho')$ to get a new feasible solution in which all but a constant
number of variables $x_{jis}$ have integer values. Moreover we can do this rounding
step in linear time.

Lemma 5. A solution for $LP(R, s, m + 1 + \rho')$ can be transformed in linear time
into another solution for $LP(R, s, m + 1 + \rho')$ in which the set $T$ of jobs that
still have fractional assignments after the rounding procedure has size $|T| \le mg$.

2.5 Generating a Feasible Schedule 

To get a feasible schedule from the solution of the linear program we need to
remove all jobs in $T$ that received fractional assignments. These jobs are scheduled
sequentially at the beginning of the schedule.

For every operation of the small jobs, consider its processing time according
to the machine selected for it by the solution of the linear program. Let $V$ be the
set formed by the small jobs containing at least one operation with processing
time larger than $\tau = \frac{\varepsilon}{8\mu^3 m g}$. Note that $|V| \le \frac{m(m+1)}{\tau}$. We remove
from the snapshots all jobs in $V$ and place them sequentially at the beginning of
the schedule.
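Selecting the set $V$ is a simple filtering step, sketched below. The input layout is an assumption for illustration: `assigned_times[j]` lists the processing times of the operations of small job $j$ on the machines chosen by the linear program, and the threshold $\tau$ is passed in as a parameter rather than computed.

```python
from typing import Dict, List, Tuple


def split_by_threshold(assigned_times: Dict[int, List[float]],
                       tau: float) -> Tuple[List[int], List[int]]:
    """Split small jobs into V (some operation longer than tau) and the rest.

    Jobs in V are removed from the snapshots and scheduled sequentially at
    the beginning; every operation of the remaining jobs has length <= tau.
    """
    V = sorted(j for j, times in assigned_times.items() if max(times) > tau)
    rest = sorted(j for j in assigned_times if j not in set(V))
    return V, rest
```

After the split, the largest operation left in any snapshot is at most $\tau$, which is what keeps the per-snapshot enlargement small.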

Let $O(\ell)$ be the set of operations from small jobs that remain in snapshot
$M(\ell)$. Let $p_{\max}(\ell)$ be the maximum processing time among the operations in
$O(\ell)$. Every snapshot $M(\ell)$ defines an instance of the job shop problem, since
the solution of the linear program assigns a unique machine to every operation.
Hence we can use Sevastianov’s algorithm [11] to find in $O(n^2)$ time a
feasible schedule for the operations $O(\ell)$; this schedule has length at most
$\bar{t}_\ell = t_\ell + \rho' + \mu^3 m\,p_{\max}(\ell)$. We must increase the length of every snapshot $M(\ell)$ to
$\bar{t}_\ell$ to accommodate the schedule produced by Sevastianov’s algorithm. Summing
up all these enlargements, we get:

Lemma 6.

$$\sum_{\ell=1}^{g} \mu^3 m\,p_{\max}(\ell) \le \mu^3 m g \tau = \frac{\varepsilon}{8} \le \frac{\varepsilon}{8} L^*_{\max}. \qquad (5)$$

The total length of the snapshots $M(\alpha_{ij}), \ldots, M(\beta_{ij})$ that contain a long operation
$O_{ij}$ might be larger than $p_{ij}$. This creates some idle times on machine $m_{ij}$.
We start operations $O_{ij}$ for long jobs $\mathcal{L}$ at the beginning of the enlarged snapshot
$M(\alpha_{ij})$. The resulting schedule is clearly feasible. Let $P(\mathcal{J}') = \sum_{J_j \in \mathcal{J}'} P_j$
be the total processing time of all jobs in some set $\mathcal{J}' \subset \mathcal{J}$ when the operations
of those jobs are assigned to the machines with the lowest processing times.

Lemma 7. A feasible schedule for the jobs $\mathcal{J}$ of length at most $(1 + \frac{3}{8}\varepsilon) L^*_{\max} + P(T \cup V)$
can be found in $O(n^2)$ time.

We can choose the number $k$ of long jobs so that $P(T \cup V) \le \frac{\varepsilon}{8} L^*_{\max}$.
Lemma 8. [7] Let {di, d 2 , ■ ■ ■ , dn} be positive values and Xj=i ^ Q 

be a nonnegative integer, a > 0 , and n > {q+ . There exists an integer k 

such that dk+i -I- ... -I- du+qk < otm and k < {q + 1) . 

Let us choose a = and q = rn{2p -|- x + !)• By Lemma 

5, U V| < mg + g < gp, gy Lemma 8 it is possible to choose a 

value k < (g -I- 1)^“^ so that the total processing time of the jobs in U V is 
at most I < |£max- This value of k can clearly be computed in constant time. 
We select the set $\mathcal{L}$ of large jobs as the set consisting of the $k$ jobs with largest
processing times $P_j = \sum_{i=1}^{\mu} [\min_{s \in M_{ij}} p^s_{ij}]$.

Lemma 9.

$$P(T \cup V) \le \frac{\varepsilon}{8} L^*_{\max}. \qquad (6)$$

Theorem 1. For any fixed $m$ and $\mu$, there is an algorithm for the flexible job
shop scheduling problem that computes for any value $\varepsilon > 0$, a feasible schedule
of length at most $(1 + \varepsilon) L^*_{\max}$ in $O(n)$ time.

Proof. By Lemmas 7 and 9, the above algorithm finds in $O(n^2)$ time a schedule of
length at most $(1 + \frac{1}{2}\varepsilon) L^*_{\max}$. This algorithm can handle $1 + \frac{2}{\varepsilon}$ distinct delivery
times. By the discussion at the beginning of Section 2.1 it is easy to modify the
algorithm so that it handles arbitrary delivery times and it yields a schedule of
length at most $(1 + \varepsilon) L^*_{\max}$. For every fixed $m$, $\mu$, and $\varepsilon$, all computations can
be carried out in $O(n)$ time, with the exception of the algorithm of Sevastianov, which
runs in $O(n^2)$ time. The latter can be sped up to get linear time by “glueing”
pairs of small jobs together as described in [8]. □
3 Preemptive Flexible Job Shop Problem with Migration

Let $t$ be a nonnegative variable, with $t \ge \delta_1$, that denotes the length of a schedule
(with delivery times). Consider $\chi$ time intervals defined as follows: $[0, t - \delta_1]$,
$[t - \delta_1, t - \delta_2]$, $\ldots$, $[t - \delta_{\chi-1}, t - \delta_\chi]$, where $\delta_1 > \ldots > \delta_\chi$ are the delivery times. First we
ignore the release times and select the machines and the time intervals in which
the operations of every job are going to be processed. To do this we define a
linear program LP (similar to that of Sect. 2.3) that minimizes the value of $t$.
The optimum solution of the LP has value no larger than $L^*_{\max}$. Again, by using
the Logarithmic Potential Price Directive Decomposition Method [5] and the
rounding technique of [8], we can compute in linear time a (1 + |)-approximate 
solution $S$ of the LP such that the size of the set $T$ of jobs that receive fractional
assignments is bounded by $m\chi$.

Let V denote the set of jobs from J\T for which at least one operation 
has processing time greater than > where i is the value of S. Let 

$\mathcal{L} = T \cup V$ and $\mathcal{S} = \mathcal{J} \setminus \mathcal{L}$. According to $S$, find a feasible schedule $\sigma_{\mathcal{S}}$ for the jobs
from $\mathcal{S}$ applying Sevastianov’s algorithm. We use the algorithm of Sevastianov to
find a schedule for the operations assigned to each time interval. The maximum
delivery completion time (when release times are ignored) of as is at most (1 + 
^)^max- By adding release times the length of as is at most (2 + since 

the maximum release time cannot be more than $L^*_{\max}$. Again, the algorithm of
Sevastianov can be sped up to take $O(n)$ time, and computing the schedule $\sigma_{\mathcal{S}}$
takes linear time.

Now, we ignore the delivery times for jobs from $\mathcal{L}$ (they are considered later).

3 2 

We note that the cardinality of set £ is bounded by 0{ ^ ^ ). As we did for the 
delivery times, the release times can be interpreted as additional operations of
jobs that have to be processed on a non-bottleneck machine. Because of this
interpretation, we can add to the set $O_{\mathcal{L}}$ of operations from $\mathcal{L}$ a set $\mathcal{R}$ of release
operations $O_{0j}$ with processing times $r_j$. Each job $J_j \in \mathcal{L}$ has to perform its
release operation $O_{0j}$ on a non-bottleneck machine at the beginning. A relative
order $R$ is an ordered sequence of the starting and finishing times of all operations
from $O_{\mathcal{L}} \cup \mathcal{R}$, such that there is a feasible schedule for $\mathcal{L}$ that respects $R$. The
ordering of the starting and finishing times of the operations divides the time line

4 2 

into intervals. We observe that the number of intervals g is bounded by ^ ). 
Note that a relative order is defined without assigning operations of long jobs to 
machines. 

For every relative order R we define a linear program to assign (fractions 
of) operations to machines and intervals that respects R. We build all relative 
orders R and solve the corresponding linear programs in constant time. At the 
end we select a relative schedule $R^*$ with the smallest solution value. We can
show that the value of this solution is no larger than the length of an optimum
preemptive schedule for $\mathcal{L}$. This solution is in general not a feasible schedule for
$\mathcal{L}$ since the order of fractions of operations within an interval could be incorrect.
However the set of operations assigned to each interval gives an instance of the 
preemptive open shop problem, which can be solved exactly in constant time [9]. 
This can be done without increasing the length of each interval and with in total
only a constant number of preemptions (at most $O(m^2)$ preemptions for each
interval) . By adding the delivery times the length of the computed schedule CT£ 
is at most The output schedule is obtained by appending as after ac- 

Theorem 2. For any fixed $m$, $\mu$, and $\varepsilon > 0$, there is a linear-time
$(2 + \varepsilon)$-approximation algorithm for the preemptive flexible job shop scheduling problem
with migration.

References 

1. P. Brandimarte, Routing and scheduling in a flexible job shop by tabu search, 
Annals of Operations Research, 22, 158-183, 1993. 

2. L.A. Goldberg, M. Paterson, A. Srinivasan, and E. Sweedyk, Better approximation
guarantees for job-shop scheduling. Proceedings of the 8th Symposium on Discrete
Algorithms (SODA 97), 599-608.

3. T. Gonzalez and S. Sahni, Flowshop and jobshop schedules: complexity and approximation,
Operations Research 26 (1978), 36-52.

4. R.L. Graham, E.L. Lawler, J.K. Lenstra, and A.H.G. Rinnooy Kan, Optimization
and approximation in deterministic sequencing and scheduling, Ann. Discrete
Math. 5 (1979), 287-326.

5. M.D. Grigoriadis and L.G. Khachiyan, Coordination complexity of parallel price- 
directive decomposition. Mathematics of Operations Research 21 (1996), 321-340. 

6. L.A. Hall and D.B. Shmoys, Approximation algorithms for constrained scheduling
problems. Proceedings of the IEEE 30th Annual Symposium on Foundations of
Computer Science (FOCS 89), 134-139.

7. K. Jansen and L. Porkolab, Linear-time approximation schemes for scheduling 
malleable parallel tasks. Proceedings of the 10th Annual ACM-SIAM Symposium 
on Discrete Algorithms, (SODA 99), 490-498. 

8. K. Jansen, R. Solis-Oba and M.I. Sviridenko, A linear time approximation scheme 
for the job shop scheduling problem. Proceedings of the Second International Work- 
shop on Approximation Algorithms (APPROX 99), 177-188. 

9. E. L. Lawler, J. Labetoulle, On Preemptive Scheduling of Unrelated Parallel Pro- 
cessors by Linear Programming, Journal of the ACM, vol. 25, no. 4, pp. 612-619, 
October 1978. 

10. E.L. Lawler, J.K. Lenstra, A.H.G. Rinnooy Kan, and D.B. Shmoys, Sequencing 
and scheduling: Algorithms and complexity, in: Handbook in Operations Research 
and Management Science, Vol. 4, North-Holland, 1993, 445-522. 

11. S.V. Sevastianov, Bounding algorithms for the routing problem with arbitrary 
paths and alternative servers. Cybernetics 22 (1986), 773-780. 

12. D.B. Shmoys, C. Stein, and J. Wein, Improved approximation algorithms for shop 
scheduling problems, SIAM Journal on Computing 23 (1994), 617-632. 

13. R.J.M. Vaessens, Generalized job shop scheduling: complexity and local search, 
Ph.D. thesis (1995), Eindhoven University of Technology. 

14. D. Williamson, L. Hall, J. Hoogeveen, C. Hurkens, J. Lenstra, S. Sevastianov,
and D. Shmoys, Short shop schedules, Operations Research 45 (1997), 288-294.




Emerging Behavior as Binary Search Trees Are 
Symmetrically Updated 



Stephen Taylor 

College of the Holy Cross, Worcester MA 01610-2395, USA, 
staylor@holycross.edu



Abstract. When repeated updates are made to a binary search tree, 
the expected search cost tends to improve, as observed by Knott. For 
the case in which the updates use an asymmetric deletion algorithm, the 
Knott effect is swamped by the behavior discovered by Eppinger. The 
Knott effect applies also to updates using symmetric deletion algorithms, 
and it remains unexplained, along with several other trends in the tree 
distribution. It is believed that updates using symmetric deletion do not 
cause search cost to deteriorate, but the evidence is all experimental. 
The contribution of this paper is to model separately several different 
trends which may contribute to or detract from the Knott effect. 



1 Background 

A binary search tree (BST) is a tree structure with a key value stored in each 
node. For each node, the key value is an upper bound on the values of keys 
in the left subtree, and a lower bound on keys in the right subtree. If there 
are no duplicate keys, a search of the tree for any given key value involves 
examining nodes in a single path from the root. An insertion into the tree is 
made by searching for a candidate key, then placing it as a child of the last node 
reached in the search, so that an inserted key is always a leaf. Deletions are more
complicated, and use one of the algorithms described in Section 2.2.

When a BST with $n$ nodes is grown by random insertions (RI) with no
deletions, the average search cost is $O(\log n)$, or equivalently, the total pathlength
from the root to every node in the tree (this is the internal pathlength, or IPL)
is $O(n \log n)$.
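The definitions above can be made concrete with a small sketch that grows a BST by random insertions and measures its internal pathlength. The class and function names are chosen for this illustration only; duplicate keys are ignored, matching the no-duplicates assumption.

```python
import random
from typing import Optional


class Node:
    def __init__(self, key: int) -> None:
        self.key = key
        self.left: Optional["Node"] = None
        self.right: Optional["Node"] = None


def insert(root: Optional[Node], key: int) -> Node:
    """Insert key as a new leaf at the end of its search path."""
    if root is None:
        return Node(key)
    node = root
    while True:
        if key < node.key:
            if node.left is None:
                node.left = Node(key)
                return root
            node = node.left
        elif key > node.key:
            if node.right is None:
                node.right = Node(key)
                return root
            node = node.right
        else:
            return root  # duplicate key: leave the tree unchanged


def internal_path_length(root: Optional[Node]) -> int:
    """Sum of the depths of all nodes (the root has depth 0)."""
    total = 0
    stack = [(root, 0)] if root is not None else []
    while stack:
        node, depth = stack.pop()
        total += depth
        for child in (node.left, node.right):
            if child is not None:
                stack.append((child, depth + 1))
    return total


def random_bst(n: int, seed: int = 0) -> Optional[Node]:
    """Grow a tree by inserting the keys 0..n-1 in random order."""
    rng = random.Random(seed)
    keys = list(range(n))
    rng.shuffle(keys)
    root = None
    for key in keys:
        root = insert(root, key)
    return root
```

Dividing the internal pathlength by $n$ gives the average depth of a node, which is the quantity that the average search cost tracks.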

An update consists of deleting the node with some particular key-value, and 
inserting another, either with the same key, or a different key. 

Culberson [CM90] refers to the leftward-only descendants of the root as the
backbone, and the distances between key-values of the backbone as intervals.
[Bri86] calls the backbone and the corresponding rightward-only descendants of
the root the shell. The length of the shell is the pathlength from smallest to
largest key. Shell intervals are defined by the key-values of the shell nodes.

1.1 Related Work 

When repeated updates are made to a binary search tree, the expected search
cost tends to improve. This Knott effect was first reported in [Kno75]. It turns out
that for updates using asymmetric [Hib62] deletion, the Knott effect is swamped
by the Eppinger effect. In [Epp83] Eppinger observed that after $O(n^2)$ updates,
the tree of size $n$ has expected search time greater than $O(\ln n)$. There is a
striking early improvement and search costs drop to about 92% of initial values,
and then, after about $n^2/2$ iterations, the cost begins to rise. It levels out after
about $n^2$ iterations. For a tree with 128 nodes, the final search cost levels out
to be about the same as for a tree built with insertions only; smaller trees, as
conjectured by Knott, fare better; but larger trees do worse. For trees of 2048
nodes, the asymptotic search cost is about 50% greater than for an RI BST.

Culberson [CM90] has given a model which explains this for updates in which
the item removed is always the item re-inserted, which he calls the Exact Fit
Domain (EFD) model. Culberson’s model is based on directed random walks,
and finds that the expected search cost is $O(\sqrt{n})$. We call similar undirected
random walks in trees updated with symmetrical deletion the Culberson effect,
although the time scales and results are different.

The Knott effect remains unexplained; and it is not the only unexplained 
behavior. Simulations reported in Evans and Culberson [EC94] for two symmet- 
ric update algorithms show a reduced average pathlength, as predicted by the 
Knott effect, but also that pathlengths from the root to the largest and smallest 
leaves (and perhaps to some other, unmeasured subset of nodes) were 1.2 to 1.3 
times longer than would be expected in a random binary search tree. We call 
this the Evans effect. 

[JK78] demonstrate analytically that Knott’s conjecture is correct for trees 
with three nodes. [BY89] does the same for trees with four nodes. [Mes91] ana- 
lyzes an update using symmetric deletion for the tree of three nodes. 

Martinez and Roura [MR98] provide randomized algorithms which maintain 
the distribution of trees after update to be the same as a RI binary search tree. 
Their algorithms are not susceptible to the breakdown caused by sorted input, 
nor to the Eppinger or Culberson effects. However, they are also immune to the 
Knott effect, and thus miss any improvements in search times it might provide. 
There are several open questions about update with symmetric deletion. 

1. Does the Knott conjecture hold if it is revised for symmetric deletion? 

2. If so, why? Is there a model which explains why stirring up the tree should 
result in shorter internal path lengths? The effect is apparent even in Hibbard 
deletions, before it is overwhelmed by the skewing of the tree. 

3. Is there a long-term degeneration in the tree? If so, it must be over a much 
longer term than the Hibbard deletion degeneration, because Eppinger’s and 
Culberson’s simulations did not detect it. 

2 Update Methodology 

2.1 Exact Fit Domain Model 

Culberson [CM89,CM90] proposed the unrealistic Exact Fit Domain (EFD) 
model to simplify analysis. The assumption is there are only n keys possible 
and no duplicates in the tree, so that when an update occurs, the new key must 
be the same as the one which was just deleted. This has the effect of localizing 
the effects of update operations, making them easier to analyze. We assert without 
mathematical justification that the EFD model gives qualitatively similar 
results to a more realistic random update model. Since repeated insertions result 
in relatively well-understood behavior if they are not in the neighborhood of a 
deletion, we claim that the EFD model simply telescopes the effect of separated 
deletions and insertions in the same area. Time scales for emerging behavior may 
be changed by the EFD model, but perhaps not other measurements. In support 
of this suggestion, note the graphs of fig. (1). [Grafting deletion is defined below.] 




Fig. 1. Simulations with and without EFD. Panels: (a) Comparing Shell Sizes, (b) Comparing IPL; horizontal axis: Updates. 



2.2 Deletion Algorithms 

Hibbard’s Asymmetrical Deletion Algorithm When Hibbard formulated 
his deletion algorithm for binary trees in [Hib62] , he was aware of the asymmetry. 
His algorithm has two steps: 

1. If the right subtree of the node to be deleted is not empty, replace the key of 
the deleted node with its successor, the left-most node in the right subtree. 
Then delete the successor. 

2. If the right subtree of the node to be deleted is empty, replace the node with 
its left subtree. 

There are two different asymmetries: the deleted node is preferentially updated 
from the right subtree; and the case when the right subtree is empty doesn’t 
have a matching simple case when the left subtree is empty. 

Hibbard proved that his asymmetric deletion did not change the distribution 
of tree shapes. He assumed that this meant that a symmetric algorithm was 
unnecessary. 
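The two steps of Hibbard's deletion can be sketched as follows. This is a minimal illustration, not code from [Hib62]; the names `Node`, `insert`, `hibbard_delete` and `inorder` are hypothetical, and keys are assumed to be distinct integers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root, key):
    """Standard unbalanced BST insertion; duplicates are ignored."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root


def hibbard_delete(root, key):
    """Return the root of the tree after removing `key` (asymmetric deletion)."""
    if root is None:
        return None
    if key < root.key:
        root.left = hibbard_delete(root.left, key)
    elif key > root.key:
        root.right = hibbard_delete(root.right, key)
    elif root.right is None:
        # Step 2: empty right subtree, so the left subtree takes the node's place.
        return root.left
    else:
        # Step 1: copy the successor (left-most node of the right subtree),
        # then delete the successor from the right subtree.
        succ = root.right
        while succ.left is not None:
            succ = succ.left
        root.key = succ.key
        root.right = hibbard_delete(root.right, succ.key)
    return root


def inorder(root):
    """Keys in sorted order, used only to inspect the result."""
    if root is None:
        return []
    return inorder(root.left) + [root.key] + inorder(root.right)
```

The asymmetry is visible in the code: only the right subtree is ever consulted for a replacement key, and only an empty right subtree triggers the cheap grafting case.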

Symmetrical Grafting Deletion This algorithm is a combination of the Hi- 
bbard deletion algorithm and its mirror image. Whether the right-favored or 
left-favored version of deletion is used is of equal probability, so the algorithm is 
symmetrical. In our simulations, we use simple alternation rather than a random 
number generator to decide which to use, but check for empty subtrees before 
considering the successor or predecessor key. We call this grafting deletion be- 
cause the subtree is grafted into the place of the deleted node when possible. 
Most published symmetric deletion algorithms are variants on grafting deletion. 
Simulations, for example fig. (2), show that one property of grafting deletion is 
that zero-size subtrees rapidly get less common than in RI BSTs. 




Fig. 2. Zero-size left-subtrees of root in 256 node BST. Horizontal axis: Updates. 



Symmetrical Non-Grafting Deletion This is a symmetric deletion algorithm 
which lacks an optimization for empty subtrees. The algorithm replaces a deleted 
node with its successor or predecessor in the tree, unless there is none, in which 
case it is replaced with the predecessor or successor. If there is neither predecessor 
nor successor (the node to be deleted is a leaf) the node is simply removed. 

The algorithm alternates between favoring predecessors and successors, so 
it is symmetrical. Because it lacks an optimization to reduce the height of the 
tree by grafting a subtree nearer the root when the other subtree is empty, we 
might expect that it would produce a distribution of binary search trees which 
includes rather more zero-sized subtrees than algorithms which include such an 
optimization. (These include the asymmetrical Hibbard deletion algorithm.) 

This algorithm is easier to analyze using a Markov chain, because the state- 
space of trees of size n can be described by a single variable, the size of the left 
subtree. In the particularly easy case of the Exact Fit Domain, in which the 
replacement key in an update is always the same as the key deleted, the size of 
the subtree can change only if the root is deleted, and only by one. 

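Assume the BST has n nodes, and let $\pi_{k,t}$ be the probability that the left 
subtree has k nodes after t updates. The root is deleted in any one update with 
probability 1/n, so for each t > 0 we have the n simultaneous equations (here we 
use Iverson notation [GKP89]: [P](term) evaluates to term if P is true, otherwise 
to zero): 

$$\pi_{k,t} = \left(1 - \frac{1}{n}\right)\pi_{k,t-1} + [k > 0]\,\frac{1 + [k = 1]}{2n}\,\pi_{k-1,t-1} + [k < n-1]\,\frac{1 + [k = n-2]}{2n}\,\pi_{k+1,t-1} \qquad (1)$$

The factors 1 + [k = 1] and 1 + [k = n-2] account for the forced choice of 
replacement when the root has no predecessor or no successor. Assuming that 
there is a steady state, we can rewrite this as 

$$\pi_{k,\infty} = \left(1 - \frac{1}{n}\right)\pi_{k,\infty} + [k > 0]\,\frac{1 + [k = 1]}{2n}\,\pi_{k-1,\infty} + [k < n-1]\,\frac{1 + [k = n-2]}{2n}\,\pi_{k+1,\infty} \qquad (2)$$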



With the additional equation $\sum_k \pi_{k,\infty} = 1$ we can solve the system to find 

$$\pi_{0,\infty} = \pi_{n-1,\infty} = \frac{1}{2(n-1)} \qquad (3)$$

$$\pi_{k,\infty} = \frac{1}{n-1}, \quad 0 < k < n-1 \qquad (4)$$
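The steady state can be checked mechanically. The sketch below is an illustration under stated assumptions, not the authors' code: it models one EFD update of the left-subtree size under non-grafting deletion (root deleted with probability 1/n, successor or predecessor chosen with equal probability, and the only available neighbour used when the other is missing), and tests whether the distribution with end probabilities 1/(2(n-1)) and interior probabilities 1/(n-1) is left unchanged. The function names are hypothetical and n is assumed to be at least 2.

```python
from fractions import Fraction


def step(pi):
    """Apply one update to the distribution pi[k] of the left-subtree size k."""
    n = len(pi)                                   # k ranges over 0 .. n-1
    new = [(1 - Fraction(1, n)) * p for p in pi]  # root untouched: k unchanged
    for k, p in enumerate(pi):
        hit = p * Fraction(1, n)                  # root deleted with probability 1/n
        if k == 0:
            new[1] += hit                         # no predecessor: successor is forced
        elif k == n - 1:
            new[n - 2] += hit                     # no successor: predecessor is forced
        else:
            new[k + 1] += hit / 2                 # successor promoted, left side grows
            new[k - 1] += hit / 2                 # predecessor promoted, left side shrinks
    return new


def claimed_steady_state(n):
    """Ends have probability 1/(2(n-1)); every interior size has 1/(n-1)."""
    ends = Fraction(1, 2 * (n - 1))
    interior = Fraction(1, n - 1)
    return [ends] + [interior] * (n - 2) + [ends]
```

Exact rational arithmetic is used so that stationarity is an equality test rather than a floating-point comparison.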



3 Emerging Behavior 

3.1 The Knott Effect 



The Knott effect is the observed tendency of a binary search tree to become 
more compact. That is, after a number of deletion and insertion operations are 
performed on a random binary search tree, the resulting trees have smaller IPL 
and therefore smaller search times. Knuth speculates that this effect may be due 
to the tendency of (grafting) delete operations to remove empty subtrees. 

A RI BST has one of the two worst possible keys at the root of any subtree 
with probability 2/(size of subtree). As a result of updates it evolves toward a 
steady state in which the probability of zero-size subtrees is smaller. For the case of 
update with non-grafting deletion, in the steady state, every subtree size except 
zero and the largest possible is equally probable, and those two sizes are half as 
likely as the others, as we have shown in eq. (3) and (4). 

If we make the assumption that subtrees have the same distribution as the 
root, this leads naturally to a recurrence for the IPL of such a tree. 



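$$f_n = \begin{cases} 0 & n = 0 \lor n = 1 \\ n - 1 + \dfrac{f_{n-1}}{n-1} + 2\sum_{k=1}^{n-2} \dfrac{f_k}{n-1} & n > 1 \end{cases} \qquad (5)$$

$$f_n = \begin{cases} 0 & n = 0 \lor n = 1 \\ f_{n-1} + \dfrac{f_{n-2}}{n-1} + \dfrac{2n-3}{n-1} & n > 1 \end{cases} \qquad (6)$$

Fig. 3. Internal Pathlength declines as tree updated. 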



This can be evaluated numerically and compared with the corresponding recurrence 
for an RI BST. The comparison shows that for very large values of n, $f_n$ 
grows quite close to $\mathrm{IPL}_n$; only for small to intermediate values do they diverge. 
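A numerical evaluation of this kind might look like the sketch below. It assumes the first-order form of the recurrence, f_n = f_{n-1} + f_{n-2}/(n-1) + (2n-3)/(n-1) with f_0 = f_1 = 0, and the standard RI BST recurrence IPL_n = n - 1 + (2/n) * (IPL_0 + ... + IPL_{n-1}); the function names are hypothetical.

```python
def updated_ipl(n_max):
    """f_0 .. f_{n_max} for the steady state under non-grafting deletion."""
    f = [0.0, 0.0]
    for n in range(2, n_max + 1):
        f.append(f[n - 1] + f[n - 2] / (n - 1) + (2 * n - 3) / (n - 1))
    return f[: n_max + 1]


def random_insertion_ipl(n_max):
    """IPL_0 .. IPL_{n_max} for a tree built by random insertions only."""
    ipl, running = [0.0], 0.0          # running = IPL_0 + ... + IPL_{n-1}
    for n in range(1, n_max + 1):
        ipl.append(n - 1 + 2.0 * running / n)
        running += ipl[n]
    return ipl


if __name__ == "__main__":
    f, ipl = updated_ipl(100000), random_insertion_ipl(100000)
    for n in (10, 100, 1000, 10000, 100000):
        print(n, f[n] / ipl[n])        # ratio of updated to RI internal path length
```

Printing the ratio for a range of n is one way to see where the two curves diverge and where they approach each other.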



3.2 The Evans Effect 



The Evans effect is reported in [EC94] for search trees updated with symmetric 
deletion algorithms. They report shells which are 1.2 to 1.3 times as long as those 
of a RI BST. Presumably there are subtree shells to which the effect would also 
apply, but clearly not to every path from the root, since the same simulations 
also showed a Knott effect reduction in average pathlength. Figure (4) shows the 
Evans effect. Note that the Evans effect doesn’t hold for non-grafting deletions; 
the shell size (which is after all, the sum of two paths) follows the IPL down. For 
grafting deletions, the shell size gradually rises. This suggests that the Evans 
effect might be due to the grafting of subshell backbone onto the shell. 

We can easily compute the expected initial size of the shell in a 
RI BST. By symmetry, the size of the shell should be twice the length of the 
backbone, and this turns out to have a simple recurrence. 

Fig. 4. Changes in shell size as tree updated. Horizontal axis: Updates. 



A tree with one node has a backbone length of zero. A tree with n nodes has 
a left-subtree of size k, 0 <= k < n, with probability 1/n, and so 

$$b_n = \begin{cases} 0 & n = 1 \\ \sum_{k=1}^{n-1} \dfrac{1 + b_k}{n} & n > 1 \end{cases} \qquad (7)$$

which has the solution $b_n = H_n - 1 \approx \gamma - 1 + \ln n$. The expected size of a RI 
BST shell is then 

$$E(\text{shell}) = 2\gamma - 2 + 2\ln n \approx 2(\ln n) - 0.845568670 \qquad (8)$$
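The closed form can be confirmed with exact arithmetic. The sketch below assumes the backbone recurrence b_1 = 0, b_n = (1/n) * sum over k = 1..n-1 of (1 + b_k), and compares it with H_n - 1, where H_n is the nth harmonic number; the function names are hypothetical.

```python
import math
from fractions import Fraction


def backbone_lengths(n_max):
    """Exact b_1 .. b_{n_max}; index 0 is unused."""
    b = [None, Fraction(0)]
    for n in range(2, n_max + 1):
        b.append(sum(1 + b[k] for k in range(1, n)) / n)
    return b


def harmonic(n):
    """Exact nth harmonic number H_n."""
    return sum(Fraction(1, i) for i in range(1, n + 1))


def expected_shell(n):
    """Twice the exact backbone length, 2 (H_n - 1)."""
    return 2 * float(harmonic(n) - 1)


def shell_approximation(n):
    """The logarithmic approximation 2 ln n - 0.845568670."""
    return 2 * math.log(n) - 0.845568670
```

The gap between `expected_shell(n)` and `shell_approximation(n)` shrinks roughly like 1/n, which is the error of replacing H_n by ln n plus Euler's constant.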



The root of an n-node tree which has evolved to a steady state with one-level 
deletion will have its left or right subtree empty with probability 1/(2(n-1)) each, 
and subtrees of size k, 0 < k < n-1, with probability 1/(n-1). This leads to a recurrence 
for the length of the backbone: 

$$c_n = \begin{cases} 0 & n = 1 \\ \dfrac{1 + c_{n-1}}{2(n-1)} + \sum_{i=1}^{n-2} \dfrac{1 + c_i}{n-1} & n > 1 \end{cases} \qquad (9)$$

This doesn’t solve so quickly or easily, but the equivalent form 

$$c_{n+1} = c_n + \frac{c_{n-1} - c_n}{2n} + \frac{1}{n} \qquad (10)$$

suggests that $c_n$ is smaller but asymptotically grows at almost the same rate as 

Similarly, according to our observation of multi-level delete, the left subtree 
evolves to be almost never empty. So a recurrence for the backbone for such trees 
can neglect the zero case. 
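The equivalence of the two forms of the steady-state backbone recurrence can be checked with a short sketch. It assumes the summation form c_1 = 0, c_n = (1 + c_{n-1})/(2(n-1)) + sum over i = 1..n-2 of (1 + c_i)/(n-1), and the incremental form c_{n+1} = c_n + (c_{n-1} - c_n)/(2n) + 1/n applied from n = 2 onward with seeds c_1 = 0 and c_2 = 1/2. Both the seeding and the function names are assumptions made for this illustration.

```python
from fractions import Fraction


def backbone_by_sum(n_max):
    """c_1 .. c_{n_max} from the summation form; index 0 is unused."""
    c = [None, Fraction(0)]
    for n in range(2, n_max + 1):
        full = (1 + c[n - 1]) / (2 * (n - 1))            # left subtree holds n-1 nodes
        interior = sum(1 + c[i] for i in range(1, n - 1)) / Fraction(n - 1)
        c.append(full + interior)
    return c


def backbone_by_increment(n_max):
    """c_1 .. c_{n_max} from the incremental form, seeded with c_1 and c_2."""
    c = [None, Fraction(0), Fraction(1, 2)]
    for n in range(2, n_max):
        c.append(c[n] + (c[n - 1] - c[n]) / (2 * n) + Fraction(1, n))
    return c[: n_max + 1]
```

Because the incremental form needs two earlier values, it cannot be started from c_1 alone; the seed c_2 = 1/2 comes from the summation form.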






( 11 ) 



3.3 The Culberson Effect 

The Culberson Effect, as explained in [CM89] and [CM90] is the tendency of 
interval endpoints along the shell of the binary search tree to engage in a random 
walk as time passes. Collisions between endpoints cause adjacent intervals to 
combine, so that the subsequent expected position and size of the resulting 
coalesced interval differs from the expected position and size of either of the two 
original intervals. 

In Culberson’s formulation for asymmetric update, the random walk is directed; 
in the case of symmetric deletion, the random walk is undirected, and 
therefore the effect is more subtle. Figure (5a) shows interval sizes near the root 
as they evolve with one-level-deletion. Figure (5b) illustrates them for grafting 
deletion. As each node is deleted (which may occur on any update with a probability 
of 1/n) the key value for that node will take on the value of either the 
predecessor or the successor node, that is, move either right or left (grafting 
deletion may cause the key value to skip over several intermediate keys). After the 
deletion, one end of the interval defined by the node has moved. The expected 
position of the node has not changed, but the expected distance from the initial 
position is one step. 

Fig. 5. Subtree sizes on shell grow as tree updated. Legend: rank of root; root left child; leftmost grandchild; leftmost great grandchild. Horizontal axis: Updates. Panel (b): grafting updates. 

An interval is identified by its relative position in the shell, not its bounds. 
Thus, we can speak of the movement of an interval when deletion of one of its 
endpoint nodes causes a new key value to appear at one end of the interval. 
This isn’t quite Brownian motion, since colliding intervals coalesce instead of 
rebounding, but after ‘long enough’ one might expect the intervals nearer the 
root in the shell to be pushing outward. 

There are several special cases. When a shell node deletion occurs through 
key replacement by the key of its successor or predecessor, which will usually 
not be a shell node, the interval has moved right (left.) However, if the successor 
(predecessor) is a shell node with an empty left (right) subtree, there is now one 
fewer node on the shell, and an interval has disappeared. 

Following Culberson, we find that the endpoint of an interval moves either left 
or right when it is (symmetrically) updated; that is, the key value which defines 
the endpoint of an interval changes either up or down as the backbone node 
which holds it is updated. If there were no interval collisions, the expected value 
of the key would stay constant, while its variance would increase linearly with the 
number of updates. Since the probability that an update to the tree will involve 
an endpoint is 1/n, the expected excursion of a key value is $O(\sqrt{\text{updates}/n})$. 

But the size of an interval is bounded below by one. The interval would cease 
to exist if its endpoints collided. So the expected size of (remaining) intervals 
will increase as the tree is updated. This effect is clearly visible in fig. (5.) It is 
much more dramatic for non-grafting deletion, perhaps because in order for a 
collision to take place the lower endpoint must have a zero-sized subtree, and 
the grafting deletion algorithm prunes the population of zero-sized subtrees. 

The Culberson effect should slightly increase IPL. 



4 Recapitulation 



Two unrealistic frameworks, Exact Fit Domain update, and non-grafting deletion 
are being used to begin understanding three effects in the evolution of binary 
search trees under update. 

Tentatively it appears that the Knott effect may be less significant for a 
large BST; the effect of fewer zero-size subtrees is predicted to disappear with 
non-grafting deletion for trees of 100000 or more nodes. 

Simulations show that the Culberson effect is still increasing after nT dele- 
tions. The fact that non-grafting deletion has a stronger Culberson effect needs 
to be accounted for in modeling. Notice that zero-length subtrees, which figure in 
the disappearance of intervals in the Culberson model, become quite rare when 
grafting deletes are used, but are relatively common with non-grafting deletes. 

What are the implications of the Evans effect for total IPL? Shell paths seem 
to be shorter than average paths in the tree, even after Evans stretching, and we 
would expect that shell lengths would be a less important contributor to IPL in 
a larger BST. 

Abstract. We present a very simple algorithm for the Least Common 
Ancestors problem. We thus dispel the frequently held notion that opti- 
mal LCA computation is unwieldy and unimplementable. Interestingly, 
this algorithm is a sequentialization of a previously known PRAM algo- 
rithm. 



1 Introduction 

One of the most fundamental algorithmic problems on trees is how to find the 
Least Common Ancestor (LCA) of a pair of nodes. The LCA of nodes u and v 
in a tree is the shared ancestor of u and v that is located farthest from the root. 
More formally, the LCA Problem is stated as follows: Given a rooted tree T, 
how can T be preprocessed to answer LCA queries quickly for any pair of nodes? 
Thus, one must optimize both the preprocessing time and the query time. 

The LCA problem has been studied intensively both because it is inherently 
beautiful algorithmically and because fast algorithms for the LCA problem can 
be used to solve other algorithmic problems. 

In [HT84], Harel and Tarjan showed the surprising result that LCA queries 
can be answered in constant time after only linear preprocessing of the tree 
T. This classic paper is often cited because linear preprocessing is necessary 
to achieve optimal algorithms in many applications. However, it is well under- 
stood that the actual algorithm presented is far too complicated to implement 
effectively. In [SV88], Schieber and Vishkin introduced a new LCA algorithm. 
Although their algorithm is vastly simpler than Harel and Tarjan’s — indeed, 
this was the point of this new algorithm — it is far from simple and still not 
particularly implementable. 

The folk wisdom of algorithm designers holds that the LCA problem still 
has no implementable optimal solution. Thus, according to hearsay, it is better 
to have a solution to a problem that does not rely on LCA precomputation if 
possible. We argue in this paper that this folk wisdom is wrong. 

In this paper, we present not only a simplified LCA algorithm, we present 
a simple LCA algorithm! We devise this algorithm by reengineering an existing 
complicated LCA algorithm: in [BBG+89] a PRAM algorithm was presented 
that preprocesses and answers queries in $O(\alpha(n))$ time and preprocesses in linear 
work. Although at first glance, this algorithm is not a promising candidate for 
implementation, it turns out that almost all of the complications are PRAM 
induced: when the PRAM complications are excised from this algorithm so that 
it is lean, mean, and sequential, we are left with an extremely simple algorithm. 

In this paper, we present this reengineered algorithm. Our point is not to 
present a new algorithm. Indeed, we have already noted that this algorithm has 
appeared as a PRAM algorithm before. The point is to change the folk wisdom so 
that researchers are free to use the full power and elegance of LCA computation 
when it is appropriate. 

The remainder of the paper is organized as follows. In Section 2, we provide 
some definitions and initial lemmas. In Section 3, we present a relatively slow 
algorithm for LCA preprocessing. In Section 4, we show how to speed up the 
algorithm so that it runs within the desired time bounds. Finally, in Section 5, 
we answer some algorithmic questions that arise in the paper but that are not 
directly related to solving the LCA problem. 

2 Definitions 

We begin by defining the Least Common Ancestor (LCA) Problem formally. 
Problem 1. The Least Common Ancestor (LCA) problem: 

Structure to Preprocess: A rooted tree T having n nodes. 

Query: For nodes u and v of tree T, query $\mathrm{LCA}_T(u, v)$ returns the least common 
ancestor of u and v in T, that is, it returns the node furthest from the root 
that is an ancestor of both u and v. (When the context is clear, we drop the 
subscript T on the LCA.) 

The Range Minimum Query (RMQ) Problem, which seems quite different 
from the LCA problem, is, in fact, intimately linked. 

Problem 2. The Range Minimum Query (RMQ) problem: 

Structure to Preprocess: A length n array A of numbers. 

Query: For indices i and j between 1 and n, query $\mathrm{RMQ}_A(i, j)$ returns the index 
of the smallest element in the subarray A[i ... j]. (When the context is clear, 
we drop the subscript A on the RMQ.) 

In order to simplify the description of algorithms that have both preprocess- 
ing and query complexity, we introduce the following notation. If an algorithm 
has preprocessing time f(n) and query time g(n), we will say that the algorithm 
has complexity (f(n), g(n)). 

Our solutions to the LCA problem are derived from solutions to the RMQ 
problem. Thus, before proceeding, we reduce the LCA problem to the RMQ 
problem. The following simple lemma establishes this reduction. 

Lemma 1. If there is an (f(n), g(n))-time solution for RMQ, then there is an 
(f(2n - 1) + O(n), g(2n - 1) + O(1))-time solution for LCA. 

As we will see, the O(n) term in the preprocessing comes from the time needed 
to create the soon-to-be-presented length 2n - 1 array, and the O(1) term in the 
query comes from the time needed to convert the RMQ answer on this array to 
an LCA answer in the tree. 

Proof: Let T be the input tree. The reduction relies on one key observation: 

Observation 2 The LCA of nodes u and v is the shallowest node encountered 
between the visits to u and to v during a depth first search traversal of T. 

Therefore, the reduction proceeds as follows. 

1. Let array E[1, ..., 2n - 1] store the nodes visited in an Euler Tour of the 
tree T.¹ That is, E[i] is the label of the ith node visited in the Euler tour 
of T. 

2. Let the level of a node be its distance from the root. Compute the Level 
Array L[1, ..., 2n - 1], where L[i] is the level of node E[i] of the Euler Tour. 

3. Let the representative of a node in an Euler tour be the index of first 
occurrence of the node in the tour²; formally, the representative of i is 
$\arg\min_j \{E[j] = i\}$. Compute the Representative Array R[1, ..., n], where 
R[i] is the index of the representative of node i. 

Each of these three steps takes O(n) time, yielding O(n) total time. To 
compute $\mathrm{LCA}_T(u, v)$, we note the following: 

- The nodes in the Euler Tour between the first visits to u and to v are 
E[R[u], ..., R[v]] (or E[R[v], ..., R[u]]). 

- The shallowest node in this subtour is at index $\mathrm{RMQ}_L(R[u], R[v])$, since L[i] 
stores the level of the node at E[i], and the RMQ will thus report the position 
of the node with minimum level. (Recall Observation 2.) 

- The node at this position is $E[\mathrm{RMQ}_L(R[u], R[v])]$, which is thus the output 
of $\mathrm{LCA}_T(u, v)$. 

Thus, we can complete our reduction by preprocessing Level Array L for RMQ. 
As promised, L is an array of size 2n - 1, and building it takes time O(n). Thus, 
the total preprocessing is f(2n - 1) + O(n). To calculate the query time observe 
that an LCA query in this reduction uses one RMQ query in L and three array 
references at O(1) time each. The query thus takes time g(2n - 1) + O(1), and 
we have completed the proof of the reduction. ■ 
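The reduction can be sketched in a few lines. The code below is an illustration under stated assumptions rather than the authors' implementation: the tree is given as a hypothetical dictionary `children` mapping each node to a list of its children, indices are 0-based instead of 1-based, and a linear scan over L stands in for whatever RMQ structure is plugged in.

```python
_DONE = object()   # sentinel marking an exhausted child iterator


def euler_tour(children, root):
    """Return (E, L, R): tour labels, their levels, and first-occurrence indices."""
    E, L, R = [root], [0], {root: 0}
    stack = [(root, 0, iter(children.get(root, ())))]
    while stack:
        node, level, pending = stack[-1]
        child = next(pending, _DONE)
        if child is _DONE:
            stack.pop()
            if stack:                       # walking back up: the parent is visited again
                E.append(stack[-1][0])
                L.append(stack[-1][1])
        else:
            R[child] = len(E)               # first visit is the representative
            E.append(child)
            L.append(level + 1)
            stack.append((child, level + 1, iter(children.get(child, ()))))
    return E, L, R


def lca(E, L, R, u, v):
    """LCA via a range-minimum over the level array between the representatives."""
    lo, hi = sorted((R[u], R[v]))
    best = min(range(lo, hi + 1), key=L.__getitem__)   # placeholder for a real RMQ
    return E[best]
```

For a tree with n nodes the arrays E and L have length 2n - 1, and adjacent entries of L differ by exactly one, which is the property exploited later.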



¹ The Euler Tour of T is the sequence of nodes we obtain if we write down the label 
of each node each time it is visited during a DFS. The array of the Euler tour has 
length 2n - 1 because we start at the root and subsequently output a node each 
time we traverse an edge. We traverse each of the n - 1 edges twice, once in each 
direction. 

² In fact, any occurrence of i will suffice to make the algorithm work, but we consider 
the first occurrence for the sake of concreteness. 

From now on, we focus only on RMQ solutions. We consider solutions to the 
general RMQ problem as well as to an important restricted case suggested by 
the array L. In array L from the above reduction adjacent elements differ by +1 
or -1. We obtain this ±1 restriction because, for any two adjacent elements in 
an Euler tour, one is always the parent of the other, and so their levels differ by 
exactly one. Thus, we consider the ±1RMQ problem as a special case. 



2.1 A Naive Solution for RMQ 

We first observe that RMQ has a solution with complexity ($O(n^2)$, O(1)): build 
a table storing answers to all of the $n^2$ possible queries. To achieve $O(n^2)$ 
preprocessing rather than the $O(n^3)$ naive preprocessing, we apply a trivial dynamic 
program. Notice that answering an RMQ query now requires just one array 
lookup. 
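One possible form of the dynamic program is sketched below, assuming 0-based indices and a hypothetical table `T` in which only entries with i <= j are meaningful. Each entry extends the answer for A[i..j-1] by one element, so the whole table costs quadratic time.

```python
def build_full_table(A):
    """T[i][j] is the index of the leftmost minimum of A[i..j] for i <= j."""
    n = len(A)
    T = [[0] * n for _ in range(n)]
    for i in range(n):
        T[i][i] = i
        for j in range(i + 1, n):
            prev = T[i][j - 1]                       # answer for A[i..j-1]
            T[i][j] = prev if A[prev] <= A[j] else j
    return T
```

A query is then the single lookup `T[i][j]`.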



3 A Faster RMQ Algorithm 

We will improve the ($O(n^2)$, O(1))-time brute-force table algorithm for (general) 
RMQ. The idea is to precompute each query whose length is a power of 
two. That is, for every i between 1 and n and every j between 1 and log n, 
we find the minimum element in the block starting at i and having length $2^j$, 
that is, we compute $M[i, j] = \arg\min_{k = i, \ldots, i + 2^j - 1} \{A[k]\}$. Table M therefore has 
size O(n log n), and we fill it in time O(n log n) by using dynamic programming. 
Specifically, we find the minimum in a block of size $2^j$ by comparing the two minima 
of its two constituent blocks of size $2^{j-1}$. More formally, $M[i, j] = M[i, j-1]$ 
if $A[M[i, j-1]] \le A[M[i + 2^{j-1}, j-1]]$ and $M[i, j] = M[i + 2^{j-1}, j-1]$ 
otherwise. 

How do we use these blocks to compute an arbitrary RMQ(i, j)? We select 
two overlapping blocks that entirely cover the subrange: let $2^k$ be the size of the 
largest block that fits into the range from i to j, that is let $k = \lfloor \log(j - i) \rfloor$. 
Then RMQ(i, j) can be computed by comparing the minima of the following two 
blocks: i to $i + 2^k - 1$ (M[i, k]) and $j - 2^k + 1$ to j ($M[j - 2^k + 1, k]$). These 
values have already been computed, so we can find the RMQ in constant time. 

This gives the Sparse Table (ST) algorithm for RMQ, with complexity 
(O(n log n), O(1)). Notice that the total computation to answer an RMQ query 
is three additions, four array references and a minimum, in addition to two other 
operations: a log and a floor. These can be seen together as the problem of finding 
the most significant bit of a word. Notice that we must have one such operation 
in our algorithm, since Harel and Tarjan [HT84] showed that LCA computation 
has a lower bound of $\Omega(\log \log n)$ on a pointer machine. Furthermore, the 
most-significant-bit operation has a very fast table lookup solution. 
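A compact sketch of the ST algorithm follows. It makes a few assumptions that are not in the text: indices are 0-based, the table is stored as `M[j][i]` with a row for blocks of length one, and the query uses k = floor(log2(j - i + 1)), computed with Python's `int.bit_length`, so that single-element ranges are handled as well. The function names are hypothetical.

```python
def build_sparse_table(A):
    """M[j][i] is the index of the leftmost minimum of A[i .. i + 2**j - 1]."""
    n = len(A)
    M = [list(range(n))]                           # row 0: blocks of length 1
    j = 1
    while (1 << j) <= n:
        half, prev = 1 << (j - 1), M[j - 1]
        row = []
        for i in range(n - (1 << j) + 1):
            left, right = prev[i], prev[i + half]  # minima of the two half blocks
            row.append(left if A[left] <= A[right] else right)
        M.append(row)
        j += 1
    return M


def rmq(A, M, i, j):
    """Index of the leftmost minimum of A[i..j], inclusive, for i <= j."""
    k = (j - i + 1).bit_length() - 1               # most significant bit of the length
    left, right = M[k][i], M[k][j - (1 << k) + 1]  # two overlapping blocks of size 2**k
    return left if A[left] <= A[right] else right
```

The second half block in the construction starts at i + 2**(j-1), so the two halves tile the block without overlap, while the two blocks used by a query are allowed to overlap.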

Below, we will use the ST algorithm to build an even faster algorithm for the 
±1RMQ problem. 

4 An (O(n), O(1))-Time Algorithm for ±1RMQ 

Suppose we have an array A with the ±1 restriction. We will use a table-lookup 
technique to precompute answers on small subarrays, thus removing the log 
factor from the preprocessing. To this end, partition A into blocks of size (log n)/2. 
Define an array A'[1, ..., 2n/log n], where A'[i] is the minimum element in the 
ith block of A. Define an equal size array B, where B[i] is a position in the ith 
block in which value A'[i] occurs. Recall that RMQ queries return the position 
of the minimum and that the LCA to RMQ reduction uses the position of the 
minimum, rather than the minimum itself. Thus we will use array B to keep 
track of where the minima in A' came from. 

The ST algorithm runs on array A' in time (O(n), O(1)). Having preprocessed 
A' for RMQ, consider how we answer any query RMQ(i, j) in A. The 
indices i and j might be in the same block, so we have to preprocess each block 
to answer RMQ queries. If i < j are in different blocks, then we can answer the 
query RMQ(i, j) as follows. First compute the values: 

1. The minimum from i forward to the end of its block. 

2. The minimum of all the blocks between i’s block and j’s block. 

3. The minimum from the beginning of j’s block to j. 

The query will return the position of the minimum of the three values computed. 
The second minimum is found in constant time by an RMQ on A', which has 
been preprocessed using the ST algorithm. But we need to know how to answer 
range minimum queries inside blocks to compute the first and third minima, and 
thus to finish off the algorithm. Thus, the in-block queries are needed whether i 
and j are in the same block or not. 

Therefore, we focus now only on in-block RMQs. If we simply performed 
RMQ preprocessing on each block, we would spend too much time in preprocessing. 
If two blocks were identical, then we could share their preprocessing. 
However, it is too much to hope for that blocks would be so repeated. The fol- 
lowing observation establishes a much stronger shared-preprocessing property. 

Observation 3 If two arrays X[1, ..., k] and Y[1, ..., k] differ by some fixed 
value at each position, that is, there is a c such that X[i] = Y[i] + c for every i, 
then all RMQ answers will be the same for X and Y. In this case, we can use 
the same preprocessing for both arrays. 

Thus, we can normalize a block by subtracting its initial offset from every 
element. We now use the ±1 property to show that there are very few kinds of 
normalized blocks. 

Lemma 4. There are $O(\sqrt{n})$ kinds of normalized blocks. 

Proof: Adjacent elements in normalized blocks differ by +1 or -1. Thus, normalized 
blocks are specified by a ±1 vector of length (1/2 · log n) - 1. There are 
$2^{(1/2 \cdot \log n) - 1} = O(\sqrt{n})$ such vectors. ■ 

We are now basically done. We create $O(\sqrt{n})$ tables, one for each possible 
normalized block. In each table, we put all $((\log n)/2)^2 = O(\log^2 n)$ answers to all 
in-block queries. This gives a total of $O(\sqrt{n} \log^2 n)$ total preprocessing of normalized 
block tables, and O(1) query time. Finally, compute, for each block in A, which 
normalized block table it should use for its RMQ queries. Thus, each in-block 
RMQ query takes a single table lookup. 
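The sharing of in-block tables can be illustrated as follows. This is a sketch under assumptions of its own: a normalized block is identified by an integer whose bits record the +1 steps, the block length is included in the key so that a shorter final block gets its own table, and tables are built lazily only for signatures that actually occur rather than for all possible ones. The function names are hypothetical.

```python
def block_signature(block):
    """Bit t is set when block[t+1] - block[t] == +1; the offset is irrelevant."""
    sig = 0
    for t in range(len(block) - 1):
        step = block[t + 1] - block[t]
        if step not in (1, -1):
            raise ValueError("input must satisfy the +-1 restriction")
        if step == 1:
            sig |= 1 << t
    return sig


def in_block_table(block):
    """table[i][j] is the offset of the leftmost minimum of block[i..j], i <= j."""
    size = len(block)
    table = [[0] * size for _ in range(size)]
    for i in range(size):
        table[i][i] = i
        for j in range(i + 1, size):
            prev = table[i][j - 1]
            table[i][j] = prev if block[prev] <= block[j] else j
    return table


def preprocess_blocks(A, size):
    """Return the key of every block and one shared table per distinct key."""
    keys, tables = [], {}
    for start in range(0, len(A), size):
        block = A[start:start + size]
        key = (len(block), block_signature(block))
        keys.append(key)
        if key not in tables:
            tables[key] = in_block_table(block)
    return keys, tables
```

An in-block query for positions i <= j of block number b is then `b * size + tables[keys[b]][i][j]`, a constant number of lookups.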

Overall, the total space and preprocessing used for normalized block tables 
and A' tables is O(n) and the total query time is O(1). We show a complete 
example below. 

4.1 Wrapping Up 

We started out by showing a reduction from the LCA problem to the RMQ 
problem, but with the key observation that the reduction actually leads to a 
±1RMQ problem. 

We gave a trivial ($O(n^2)$, O(1))-time table-lookup algorithm for RMQ, and 
showed how to sparsify the table to get an (O(n log n), O(1))-time table-lookup 
algorithm. We used this latter algorithm on a smaller summary array A' and 
needed only to process small blocks to finish the algorithm. Finally, we noticed 
that most of these blocks are the same, from the point of view of the RMQ 
problem, by using the ±1 assumption given by the original reduction. 



5 A Fast Algorithm for RMQ 

We have an (O(n), O(1)) ±1RMQ. Now we show that the general RMQ can be 
solved in the same complexity. We do this by reducing the RMQ problem to the 
LCA problem! Thus, to solve a general RMQ problem, one would convert it to 
an LCA problem and then back to a ±1RMQ problem. 

The following lemma establishes the reduction from RMQ to LCA. 

Lemma 5. If there is an (O(n), O(1)) solution for LCA, then there is an 
(O(n), O(1)) solution for RMQ. 

We will show that the O(n) term in the preprocessing comes from the time 
needed to build the Cartesian Tree of A and the O(1) term in the query comes 
from the time needed to convert the LCA answer on this tree to an RMQ answer 
on A. 

Proof: Let A[1, ..., n] be the input array. 

The Cartesian Tree of an array is defined as follows. The root of a Cartesian 
Tree is the minimum element of the array, and the root is labeled with the 
position of this minimum. Removing the root element splits the array into two 
pieces. The left and right children of the root are the recursively constructed 
Cartesian trees of the left and right subarrays, respectively. 

A Cartesian Tree can be built in linear time as follows. Suppose $C_i$ is the 
Cartesian tree of A[1, ..., i]. To build $C_{i+1}$, we notice that node i+1 will belong 
to the rightmost path of $C_{i+1}$, so we climb up the rightmost path of $C_i$ until 
finding the position where i+1 belongs. Each comparison either adds an element 
to the rightmost path or removes one, and each node can only join the rightmost 
path and leave it once. Thus the total time to build $C_n$ is O(n). 
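The climb along the rightmost path is commonly implemented with a stack. The sketch below is one such rendering, not the authors' code; it assumes 0-based indices, uses -1 for a missing link, and keeps the earlier of two equal elements nearer the root, so the root is the leftmost minimum. The function name is hypothetical.

```python
def cartesian_tree(A):
    """Return (root, parent, left, right) as index arrays for the Cartesian tree of A."""
    n = len(A)
    parent, left, right = [-1] * n, [-1] * n, [-1] * n
    stack = []                                   # the rightmost path, root at the bottom
    for i in range(n):
        last = -1
        while stack and A[stack[-1]] > A[i]:     # climb until A[i] fits below the top
            last = stack.pop()
        if last != -1:                           # the popped chain hangs to the left of i
            left[i], parent[last] = last, i
        if stack:                                # i becomes the new rightmost child
            right[stack[-1]], parent[i] = i, stack[-1]
        stack.append(i)
    root = stack[0] if stack else -1
    return root, parent, left, right
```

Every index is pushed once and popped at most once, which gives the linear bound.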

The reduction is as follows. 

- Let C be the Cartesian Tree of A. Recall that we associate the node 
in C corresponding to A[i] with the index i. 

Claim. $\mathrm{RMQ}_A(i, j) = \mathrm{LCA}_C(i, j)$. 

Proof: Consider the least common ancestor, k, of i and j in the Cartesian Tree C. 
In the recursive description of a Cartesian tree, k is the first node that separates 
i and j. Thus, in the array A, element A[k] is between elements A[i] and A[j]. 
Furthermore, A[k] must be the smallest such element in the subarray A[i, ..., j] 
since otherwise, there would be a smaller element k' in A[i, ..., j] that would 
be an ancestor of k in C, and i and j would already have been separated by k'. 

More concisely, since k is the first element to split i and j, it is between them 
because it splits them, and it is minimal because it is the first element to do so. 
Thus it is the RMQ. □ 
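The claim is easy to test exhaustively on small arrays. The sketch below builds parent pointers of the Cartesian tree with a stack, finds the LCA by a deliberately naive walk toward the root, and compares it with a brute-force range minimum. It assumes 0-based indices and breaks ties in favour of the leftmost minimum; the function names are hypothetical, and the naive LCA is only a checker, not the constant-time structure.

```python
def cartesian_parents(A):
    """Parent index of every position in the Cartesian tree of A (-1 for the root)."""
    parent, stack = [-1] * len(A), []
    for i in range(len(A)):
        last = -1
        while stack and A[stack[-1]] > A[i]:
            last = stack.pop()
        if last != -1:
            parent[last] = i
        if stack:
            parent[i] = stack[-1]
        stack.append(i)
    return parent


def naive_lca(parent, u, v):
    """Collect the ancestors of u, then climb from v until one of them is met."""
    ancestors = set()
    while u != -1:
        ancestors.add(u)
        u = parent[u]
    while v not in ancestors:
        v = parent[v]
    return v


def check_claim(A):
    """True when LCA(i, j) in the Cartesian tree equals RMQ(i, j) for all i <= j."""
    parent = cartesian_parents(A)
    for i in range(len(A)):
        for j in range(i, len(A)):
            expected = min(range(i, j + 1), key=A.__getitem__)
            if naive_lca(parent, i, j) != expected:
                return False
    return True
```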

We see that we can complete our reduction by preprocessing the Cartesian 
Tree C for LCA. Tree C takes time O(n) to build, and because C is an n node 
tree, LCA preprocessing takes O(n) time, for a total of O(n) time. The query 
then takes O(1), and we have completed the proof of the reduction. ■ 

Abstract. Steiner triple systems are well studied combinatorial designs 
that have been shown to possess properties desirable for the construc- 
tion of multiple erasure codes in RAID architectures. The ordering of 
the columns in the parity check matrices of these codes affects system 
performance. Combinatorial problems involved in the generation of good 
and bad column orderings are defined, and examined for small numbers 
of accesses to consecutive data blocks in the disk array. 



1 Background 

A Steiner triple system is an ordered pair (S, T) where S is a finite set of points
or symbols and T is a set of 3-element subsets of S called triples, such that
each pair of distinct elements of S occurs together in exactly one triple of T.
The order of a Steiner triple system (S, T) is the size of the set S, denoted
|S|. A Steiner triple system of order v is often written as STS(v). An STS(v)
exists if and only if v ≡ 1, 3 (mod 6) (see [5], for example). We can relax the
requirement that every pair occurs exactly once as follows. Let (V, B) be a set
V of elements together with a collection B of 3-element subsets of V, so that no
pair of elements of V occurs as a subset of more than one set in B. Such a pair
(V, B) is an (n, ℓ)-configuration when n = |V| and ℓ = |B|, and every element of
V is in at least one of the sets in B.
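
These definitions translate directly into small checks. The sketch below is illustrative only; the function names are hypothetical, points are assumed to be labelled 0..v-1, and the example system is the STS(7) that the paper lists later in Section 3.

```python
from itertools import combinations


def is_partial_triple_system(triples):
    """True if no pair of points lies in more than one triple."""
    seen = set()
    for t in triples:
        if len(set(t)) != 3:
            return False
        for pair in combinations(sorted(t), 2):
            if pair in seen:
                return False
            seen.add(pair)
    return True


def is_steiner_triple_system(v, triples):
    """True if the triples on points 0..v-1 cover every pair exactly once."""
    if any(not set(t) <= set(range(v)) for t in triples):
        return False
    # Each triple covers 3 pairs; with no repeats, covering all pairs is a count.
    return is_partial_triple_system(triples) and 3 * len(triples) == v * (v - 1) // 2


def configuration_type(triples):
    """Return (n, l): the number of points spanned and the number of triples."""
    points = set().union(*map(set, triples))
    return len(points), len(triples)


FANO = [(0, 1, 3), (1, 2, 4), (0, 2, 6), (1, 5, 6), (2, 3, 5), (3, 4, 6), (0, 4, 5)]
```

For instance, `configuration_type(FANO[:2])` reports the (5,2)-configuration formed by two intersecting triples.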

Let C be a configuration (V,B). We examine the following combinatorial 
problems. When does there exist a Steiner triple system (S, T) of order v in which
the triples can be ordered T_0, ..., T_{b-1}, where b = v(v - 1)/6 is the number of
triples, so that every ℓ consecutive triples form
a configuration isomorphic to C? Such an ordering is a C-ordering of the Steiner
triple system. When we treat the first triple as following the last (and hence
cyclically order the triples), and then enforce the same condition, the ordering
is a C-cyclic ordering. The presence of configurations in Steiner triple systems
has been studied in much detail; see [5] for an extensive survey. Apparently, 
the presence or absence of configurations among consecutive triples in a triple 
ordering of an STS has not been previously examined. 

Our interest in these problems arises from an application in the design of 
erasure codes for disk arrays. Prior to examining the combinatorial problems 
posed, we explore the disk array application. As processor speeds have increased 
rapidly in recent years, one method of bridging the Input-Output (I/O) performance
gap has been to use redundant arrays of independent disks (RAID)
[9]. Individual data reads and writes are striped across multiple disks, thereby 
creating I/O parallelism. Encoding redundant information onto additional disks 
allows reconstruction of lost information in the presence of disk failures. This 
creates disk arrays with high throughput and good reliability. However, an array 
of disks has a substantially greater probability of a disk failure than does an
individual disk [8,9]. Indeed, Hellerstein et al. [8] have shown that an array
of 1000 disks which protects against one error, even with periodic
daily or weekly repairs, has a lower reliability than an individual disk. Most
systems that are available currently handle only one or two disk failures [15]. 
As arrays grow in size, the need for greater redundancy without a reduction in 
performance becomes important. 

A catastrophic disk failure is an erasure. When a disk fails, all of the information
is lost or erased. Codes that can correct for n erasures are called n-erasure
correcting codes. The minimum number of additional disks that must be accessed
for each write in an n-erasure code, the update penalty, has been shown to be n 
[1,8]. Chee, Colbourn, and Ling [1] have shown that Steiner triple systems pos- 
sess properties that make them desirable 3-erasure correcting codes with minimal 
update penalties. The correspondence between Steiner triple systems and parity 
check matrices is that used by Hellerstein et al. [3,8]. Codewords in a binary 
linear code are viewed as vectors of information and check bits. The code can 
then be defined in terms of a c × (k + c) parity check matrix, H = [P | I], where
k is the number of information disks, I is the c × c identity matrix and P is a
c × k matrix that determines the equations of the check disks. The columns of
P are indexed by the k information disks. The columns of I and the rows of 
H are indexed by the c check disks. A set of disk failures is recoverable if and 
only if the corresponding set of equations in its parity check matrix is linearly 
independent [1,8]. Any set of t binary vectors is linearly independent over GF[2] 
if and only if the vector sum modulo two of those columns, or any non-empty 
subset of those columns, is not equal to the zero vector [8] . 
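
The recoverability criterion can be illustrated with a small sketch. It is not taken from the paper: columns are encoded as integer bitmasks (one bit per check disk), the function names are hypothetical, and the example code is built from an STS(7) purely for illustration.

```python
from itertools import combinations


def independent_gf2(columns):
    """True if the bitmask columns are linearly independent over GF(2)."""
    pivots = {}  # leading bit position -> reduced column with that leading bit
    for c in columns:
        while c:
            h = c.bit_length() - 1
            if h not in pivots:
                pivots[h] = c
                break
            c ^= pivots[h]  # eliminate the leading bit
        if c == 0:
            return False  # c was a sum of earlier columns
    return True


def parity_check_columns(num_checks, triples):
    """Columns of H = [P | I]: one weight-3 column per triple, then the identity."""
    info = [sum(1 << p for p in t) for t in triples]
    check = [1 << r for r in range(num_checks)]
    return info + check


FANO = [(0, 1, 3), (1, 2, 4), (0, 2, 6), (1, 5, 6), (2, 3, 5), (3, 4, 6), (0, 4, 5)]
COLUMNS = parity_check_columns(7, FANO)
```

Every set of three columns of this matrix is independent, which is the 3-erasure property; an information disk together with its three check disks is dependent.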



[Parity check matrix H = [P | I] for an STS(13); the matrix entries are not recoverable from the extracted text.]

Fig. 1. Steiner (13) Parity Check Matrix: The shaded disks are check disks.

Figure 1 shows a parity check matrix for an STS(13). Cohen and Colbourn
[3] examine the ordering of columns in the parity check matrices. This departs 
from the standard theory of error correcting codes where the order of columns 
in a parity check matrix is unimportant [16]. 

One particular class of these codes, anti-Pasch Steiner triple systems, has 
been shown to correct for all 4-erasures except for bad erasures [1]. A bad erasure 
is one that involves an information disk and all three of its check disks. 




Fig. 2. Pasch Configuration 



Figure 2 shows six elements {a, b, c, d, e, f} and four triples {a, b, c}, {a, d, e},
{f, b, d} and {f, c, e}. These form a (6,4)-configuration called a Pasch configuration
or quadrilateral [14]. The points represent the check disks (rows of the
parity check matrix). Each triple represents an information disk (column of the
parity check matrix). If we convert this diagram to a (portion of a) parity check
matrix we find that if all four information disks fail there is an irrecoverable
loss of information: the resulting four columns in the parity check matrix are
linearly dependent, so the lost data cannot be reconstructed. Anti-Pasch Steiner
triple systems yield codes which avoid this configuration. The existence of
anti-Pasch STS(v) for all v ≡ 1 or 3 (mod 6) except when v = 7 or 13 was recently
settled [7,14].
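
The dependency among the four Pasch columns is easy to see numerically. In the following sketch (an illustration with hypothetical names, using bitmasks for columns) every point lies in exactly two of the four triples, so the columns cancel modulo two.

```python
from functools import reduce
from itertools import combinations
from operator import xor

POINTS = "abcdef"                      # check disks (rows)
PASCH = ["abc", "ade", "fbd", "fce"]   # information disks (columns)


def column(triple):
    """Bitmask with one bit set for each check disk used by this information disk."""
    return sum(1 << POINTS.index(p) for p in triple)


def xor_sum(columns):
    """Vector sum modulo two of a collection of bitmask columns."""
    return reduce(xor, columns, 0)


COLUMNS = [column(t) for t in PASCH]
ALL_FOUR = xor_sum(COLUMNS)  # zero: the four columns are linearly dependent
```

No smaller non-empty subset of these columns sums to zero, so losing any three of the four disks is still recoverable.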

Cohen and Colbourn [3] examined some of the issues pertaining to encoding 
Steiner triple systems in a disk array. In a multiple erasure correcting disk array, 
there may be an overlap among the check disks accessed for consecutive informa- 
tion disks in reads and writes. The number of disks needed in an individual write 
can therefore be minimized by ordering the columns of this matrix. Using the 
assumption that the most expensive part of reading or writing in a disk array is 
the physical read or write to the disk, this overlap can have a significant effect on 
performance. Cohen and Colbourn [3] describe a write to a triple erasure code 
as follows. First the information disks are read followed by all of their associated 
check disks. In the case when check disks overlap, the physical read only takes 
place once. All of the new parity is computed and then this new parity and the 
new information is written back to the disks. Once again, the shared check disks 
are only physically written to one time. Theoretically, the update penalty is the 
same for all reads and writes in an array. But when more than one information
disk in an array shares a common check disk this saves two disk accesses, one 
read and one write. This finally leads to the questions posed at the outset. In 
particular, can one ordering be found that optimizes writes of various sizes in 
such an array? 

In order to derive some preliminary results about ordering we have imple- 
mented a computer simulation [4,3]. RaidSim [9,12,13] is a simulation program 
written at the University of California at Berkeley [12]. Holland [9] extends it 
to include declustered parity and online reconstruction. The RaidSim program
models disk reads and writes and simulates the passage of time. The modified
version from [9] is the starting point for our experiments. We extend RaidSim to
include mappings for Steiner triple systems, to tolerate multiple disk failures,
and to detect the existence of unrecoverable four and five erasures [4].

The performance experiments are run with a simulated user concurrency 
level of 500. Six Steiner triple systems of order 15 are used in these experiments. 
These are the systems numbered 1, 2, 20, 38, 67 and 80 in [5]. There are 80
nonisomorphic systems of order 15. The number of Pasch configurations in an STS(15)
ranges from 105 in system 1 to zero in system 80.

2 Pessimal Ordering 

A worst triple ordering is one in which consecutive triples are all disjoint. Indeed, 
if the reads and writes involve at most ℓ consecutive data blocks, a worst triple
(or column) ordering is one in which each set of ℓ consecutive triples has all
triples disjoint. Let D_ℓ be the (3ℓ, ℓ)-configuration consisting of ℓ disjoint triples.
A pessimal ordering of an STS is a D_ℓ-ordering. It is easily seen that a D_{ℓ+1}-ordering
is also a D_ℓ-ordering.

The unique STS(7) has no D_2-ordering since every two of its triples intersect.
The unique STS(9) has no D_2-ordering, as follows. Consider a triple T. There are
exactly two triples, T' and T'', disjoint from T (and indeed T' and T'' are disjoint
from each other as well). Without loss of generality, suppose that T' precedes
T and T'' follows T in the putative ordering. Then no triple can precede T' or
follow T''. These two small cases are, in a sense, misleading. Both STS(13)s and
all eighty STS(15)s admit not only a D_2-ordering, but also a D_3-cyclic ordering.
This is easily established using a simple backtracking algorithm.
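
A backtracking search of the kind mentioned can be sketched as follows. This is an assumed, unoptimized illustration (the function name and interface are hypothetical, not the authors' program): it extends a partial ordering one triple at a time, requiring each new triple to be disjoint from the previous ℓ - 1 triples, and checks the wrap-around windows at the end when a cyclic ordering is requested.

```python
def find_disjoint_ordering(triples, ell, cyclic=False):
    """Return an ordering in which every ell consecutive triples are pairwise
    disjoint (a D_ell-ordering), or None if no such ordering exists."""
    sets = [frozenset(t) for t in triples]
    n = len(sets)
    order = []            # indices of the triples placed so far
    used = [False] * n

    def cyclic_ok():
        # Each triple must be disjoint from the ell - 1 triples cyclically before it.
        return all(
            not (sets[order[pos]] & sets[order[(pos - d) % n]])
            for pos in range(n)
            for d in range(1, ell)
            if (pos - d) % n != pos
        )

    def extend():
        if len(order) == n:
            return cyclic_ok() if cyclic else True
        tail = order[max(0, len(order) - (ell - 1)):]
        for i in range(n):
            if not used[i] and all(not (sets[i] & sets[j]) for j in tail):
                used[i] = True
                order.append(i)
                if extend():
                    return True
                order.pop()
                used[i] = False
        return False

    return [triples[i] for i in order] if extend() else None


STS7 = ["013", "124", "026", "156", "235", "346", "045"]
STS9 = ["012", "036", "138", "048", "147", "057",
        "345", "237", "246", "678", "258", "156"]
```

On the two small systems the search terminates immediately with no ordering, in line with the arguments above; larger systems may need pruning or symmetry breaking that this sketch does not attempt.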

We establish a general result: 

Theorem 1. For v ≡ 1, 3 (mod 6) and v ≥ 9ℓ - 6, there exists an STS(v) with
a D_ℓ-ordering.

Proof. When v ≡ 3 (mod 6), there is a Steiner triple system (S, T) of order v
in which the triples can be partitioned into (v - 1)/2 classes R_1, ..., R_{(v-1)/2}, so
that within each class all triples are disjoint. Each class R_i contains v/3 triples.
(This is a Kirkman triple system; see [5].) When v ≡ 1 (mod 6) and v ≥ 19,
there is a Steiner triple system (S, T) of order v in which the triples can be
partitioned into (v + 1)/2 classes R_1, ..., R_{(v+1)/2}, so that within each class all
triples are disjoint. R_1 contains (v - 1)/6 triples, and each other class contains
(v - 1)/3. (This is a Hanani triple system; see [5].)

Our orderings place all triples of R_i before all triples of R_{i+1} for each 1 ≤ i <
s, where s is the number of classes. We must order the triples of each class R_i. To
do this, we first order the triples
of R_1 arbitrarily. Let us then suppose that R_1, ..., R_{i-1} have been ordered. To
select the jth triple in the ordering of R_i for 1 ≤ j < ℓ, we choose a triple which
has not already been chosen in R_i, and which does not intersect any of the last
ℓ - j triples in R_{i-1}. Such a triple exists, since j - 1 triples have been chosen,
and at most 3(ℓ - j) triples of R_i intersect any of the last ℓ - j triples of R_{i-1},
but 3(ℓ - j) + j - 1 < 3ℓ - 2 ≤ ⌊v/3⌋ for all j ≥ 1.

A similar proof yields D_ℓ-cyclic orderings when v is larger. What is striking
about the computational results for order 15 is not that an ordering for some
system can be found, but that every system has a D_3-cyclic ordering. This
suggests the possibility that for v sufficiently large, every STS(v) admits a
D_ℓ-cyclic ordering. To verify this, form the t-intersection graph G_t of a triple system
(S, T) by including a vertex for each triple in T, and making two vertices adjacent
if the corresponding triples share t elements. A D_2-cyclic ordering of (S, T) is
equivalent to a Hamilton cycle in G_0. But more is true. A D_ℓ-cyclic ordering of
(S, T) is equivalent to the (ℓ - 1)st power of a Hamilton cycle in G_0. Komlós,
Sárközy, and Szemerédi [11] establish that for any ε > 0 and any sufficiently
large n-vertex graph G of minimum degree at least (k/(k + 1) + ε)n, G contains
the kth power of a Hamilton cycle. Now G_0 has v(v - 1)/6 vertices and degree
(v(v - 10) + 21)/6, and so G_0 meets the required conditions. Thus when ℓ is fixed,
every sufficiently large STS(v) admits a D_ℓ-ordering. We give a direct proof of
this, which does not rely upon probabilistic methods.
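
The degree of G_0 follows from a short count: each point lies in (v - 1)/2 triples, so a triple meets 3(v - 3)/2 others, leaving v(v - 1)/6 - 1 - 3(v - 3)/2 = (v(v - 10) + 21)/6 triples disjoint from it. The sketch below (hypothetical helper names, small systems only) compares this formula with a direct count.

```python
def g0_degree_formula(v):
    """Number of triples disjoint from a given triple in an STS(v)."""
    return (v * (v - 10) + 21) // 6


def g0_degrees(triples):
    """Degree of every vertex of the 0-intersection graph, by direct counting."""
    sets = [set(t) for t in triples]
    # A triple always meets itself, so it is never counted as its own neighbour.
    return [sum(1 for u in sets if not (s & u)) for s in sets]


STS7 = ["013", "124", "026", "156", "235", "346", "045"]
STS9 = ["012", "036", "138", "048", "147", "057",
        "345", "237", "246", "678", "258", "156"]
```

For v = 7 the degree is 0 and for v = 9 it is 2, matching the two small cases discussed in this section.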

Theorem 2. For ℓ a positive integer and v ≥ 81(ℓ - 1) + 1, every STS(v) admits
a D_ℓ-ordering.

Proof. Let (S, T) be an STS(v). Form the 1-intersection graph G_1 of (S, T).
G_1 is regular of degree 3(v - 3)/2, and therefore has a proper vertex coloring
in s ≤ 3(v - 1)/2 colors. Form a partition of T, defining classes R_1, ..., R_s of
triples by placing a triple in the class R_i when the corresponding vertex of G_1
has the ith color. Let us suppose without loss of generality that |R_i| ≤ |R_{i+1}| for
1 ≤ i < s. Now if 3|R_1| < |R_s|, there must be a triple of R_s intersecting no triple
of R_1. When this occurs, move such a triple from R_s to R_1. This can be repeated
until 3|R_1| ≥ |R_s|. Since |T| = v(v - 1)/6, we find that |R_s| ≥ ⌈v/9⌉ and
thus |R_1| ≥ ⌈v/27⌉. But then |R_1| > 3ℓ - 3, and we can apply precisely the
method in the proof of Theorem 1 to produce the ordering required.

The bound on v can almost certainly be improved upon. Indeed for ℓ = 3,
we expect that every STS(v) with v ≥ 13 has a D_3-cyclic ordering.

3 Optimal Ordering

Optimal orderings pose more challenging problems. We wish to minimize rather
than maximize the number of check disks associated with ℓ consecutive triples.
We begin by considering small values of ℓ. When ℓ = 2, define the configuration
I_2 to be the unique (5,2)-configuration, which consists of two intersecting triples.
An optimal ordering is an I_2-ordering. Horák and Rosa [10] essentially proved
the following:

Theorem 3. Every STS(v) admits an I_2-cyclic ordering.

Proof. The 1-intersection graph G_1 of the STS has a hamiltonian cycle [10].

Let T be the unique (6,3)-configuration, as depicted in Figure 3. 




Fig. 3. Optimal Ordering on Three Blocks 



An optimal ordering when ℓ = 3 is a T-ordering. The unique STS(7) has a
T-cyclic ordering: 013, 124, 026, 156, 235, 346, 045. The unique STS(9) also has 
a T-cyclic ordering: 012, 036, 138, 048, 147, 057, 345, 237, 246, 678, 258, 156. 
We might therefore anticipate that every STS(v) has a T-cyclic ordering, but 
this does not happen. To establish this, we require a few definitions. A proper 
subsystem of a Steiner triple system (S, T) is a pair (S', T') with S' ⊂ S and
T' ⊂ T, |S'| > 3, and (S', T') itself a Steiner triple system. A Steiner space is a
Steiner triple system with the property that every three elements which do not 
appear together in a triple are contained in a proper subsystem. 
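
The two T-cyclic orderings listed above can be verified mechanically. Three triples of a (partial) triple system span exactly six points only when they meet pairwise in three distinct points, so it suffices to count the points in each window of three consecutive triples. The sketch below uses a hypothetical function name and assumes its input is a list of triples of a partial triple system.

```python
def is_t_ordering(order, cyclic=False):
    """True if every three consecutive triples span exactly six points."""
    n = len(order)
    starts = range(n) if cyclic else range(n - 2)
    for s in starts:
        window = [set(order[(s + k) % n]) for k in range(3)]
        if len(set().union(*window)) != 6:
            return False
    return True


STS7_ORDER = ["013", "124", "026", "156", "235", "346", "045"]
STS9_ORDER = ["012", "036", "138", "048", "147", "057",
              "345", "237", "246", "678", "258", "156"]
```

Both listed orderings pass the cyclic check, while an arbitrary rearrangement such as the lexicographic order of the STS(9) triples does not.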

Theorem 4. No Steiner space admits a T-ordering. Hence, whenever we have
v ≡ 1, 3 (mod 6), v ≥ 15, and v ∉ {19, 21, 25, 33, 37, 43, 51, 67, 69, 145}, there
is a Steiner triple system admitting no T-ordering.

Proof. Suppose that (S, T) is a Steiner space which has a T-ordering. Then
consider two consecutive triples under this ordering. These are contained within 
a proper subsystem. Any triple preceding or following two consecutive triples of 
a subsystem must also lie in the subsystem. But this forces all triples of T to lie 
in the subsystem, which is a contradiction. 

The conditions on v reflect the current knowledge about the existence of
Steiner spaces (see [5]). 

A much weaker condition suffices to establish that there is no T-ordering. 
A T-ordering cannot have any two consecutive triples which appear together 
in a proper subsystem. By the same token, a T-ordering cannot have any two 
triples which appear together in a proper subsystem and are separated by only 
one triple in the ordering. Hence the strong condition on subsystems enforced in 
Steiner spaces can be relaxed. Of course, our interest is in producing Steiner triple 
systems that do admit T-orderings. Both STS(13)s admit T-orderings but not 
cyclic T-orderings. Of the 80 STS(15)s, only fourteen admit cyclic T-orderings; 
they are numbers 20, 22, 38, 39, 44, 48, 50, 51, 52, 53, 65, 67, 75, and 76. However, 
73 of the systems (those numbered 8-80) admit a T-ordering. System 1 is the 
projective triple system and hence is a Steiner space (see [5]). However, the six 
systems numbered 2-7 also do not appear to admit a T-ordering. These results 
have all been obtained with a simple backtracking algorithm. 

General constructions for larger orders appear to be difficult to establish. 
However, we expect that for every order v ≥ 15 there exists a system having a
T-cyclic ordering. For example, let T_{i0} = {i, 5 + i, 11 + i}, T_{i1} = {i, 2 + i, 9 + i},
and T_{i2} = {1 + i, 2 + i, 5 + i}, with arithmetic modulo 19. Then an STS(19)
with a T-cyclic ordering exists with the triple T_{ij} in position 27i + j mod 57 for
0 ≤ i < 19 and 0 ≤ j < 3. A similar solution for v = 25 is obtained by setting
T_{i0} = {i, 1 + i, 6 + i}, T_{i1} = {6 + i, 8 + i, 16 + i}, T_{i2} = {1 + i, 8 + i, 22 + i}, and
T_{i3} = {3 + i, 6 + i, 22 + i}, arithmetic modulo 100. Place triple T_{ij} in position
32i + j modulo 100. While these small designs indicate that specific systems
admitting an ordering can be easily found, we have not found a general pattern 
for larger orders. 

When £ = 4, four triples must involve at least six distinct elements. Indeed, 
the only (6,4)-configuration is the Pasch configuration. It therefore appears that
the best systems from an ordering standpoint (when £ = 4) are precisely those 
which are poor from the standpoint of erasure correction. However, in our perfor- 
mance experiments, ordering plays a larger role than does the erasure correction 
capability [4,3]. Hence it is sensible to examine STSs which admit orderings with 
Pasch configurations placed consecutively. Unfortunately, this does not work in 
general: 

Lemma 1. No STS(v) for v > 3 is Pasch-orderable.

Proof. Any three triples of a Pasch configuration lie in a unique Pasch config- 
uration. Hence four consecutive triples forming a Pasch configuration for some 
triple ordering can neither be preceded nor followed by a triple which forms a 
second Pasch configuration. 

It therefore appears that an optimal ordering has exactly ⌈((v(v - 1)/6) - 3)/2⌉ of
the sets of four consecutive triples inducing a Pasch configuration; these alternate
with sets of four consecutive triples forming a (7,4)-configuration. We have not
explored this possibility.

A general theory for all values of ℓ would be worthwhile, but appears to be
substantially more difficult than for pessimal orderings.

4 Conclusions 

It is natural to ask whether the orderings found here have a real impact on disk 
array performance. Figures 4-6 show the results of performance experiments 
using various orderings. The desired orderings will provide the lowest response 
times. The ‘good’ ordering is a T-ordering when one exists, and otherwise is an 
ordering found in an effort to maximize the number of consecutive T configurations;
it is labeled A in these figures. The ‘bad’ ordering is a D_3-ordering and is
labeled B. The ordering labeled C is one obtained from a random triple ordering. 
The most significant difference in performance arises in a workload of straight 
writes. This is as expected because this is where decreasing the actual update 
penalty has the greatest impact. Although the read workload shows no apparent 
differences during fault-free mode, it does start to differentiate when multiple 
failures occur. 

Fig. 4. Ordering Results - Straight Write Workload



The structure of optimal orderings for ℓ ≥ 4 is an open and interesting question.
Minimizing disk access through ordering means that the update penalty
is only an upper bound on the number of accesses in any write. By keeping 
the number of check disk accesses consistently lower, performance gains can be 
achieved. An interesting question is the generalization for reads and writes of 
different sizes: Should an array be configured specifically for a particular size 
when optimization is desired? One more issue in optimization of writes in triple 
erasure codes is that of the large or stripe write [9]. At some point, if we have a 
large write in an array, all of the check disks are accessed. There is a threshold 
beyond which it is less expensive to read all of the information disks, compute 
the new parity and then write out all of the new information disks and all of
the check disks. When using an STS(15), a threshold occurs beyond the halfway
point. An STS(15) has 35 information and 15 check disks. If 23 information disks
are to be written they must use at least 14 check disks. In the method of writing described
above, 46 information accesses and 28 check disk accesses yield a total of 74 physical
disk accesses. A large write instead has 35 reads, followed by 15 + 23 = 38
writes, for a total of 73 physical accesses. This threshold for all STSs determines
to some extent how to optimize disk writes.
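
The access counts in this comparison can be reproduced with two small helper functions. These are illustrative only; the names are hypothetical, and the model assumes, as in the text, that an ordinary write reads and then writes every disk it touches, while a large write reads all information disks and then writes the changed information disks and all check disks.

```python
def small_write_accesses(info_written, checks_touched):
    """Read-modify-write: each touched disk is read once and written once."""
    return 2 * (info_written + checks_touched)


def large_write_accesses(info_total, check_total, info_written):
    """Stripe write: read every information disk, then write the changed
    information disks and every check disk."""
    return info_total + info_written + check_total
```

For the STS(15) example, `small_write_accesses(23, 14)` gives 74 and `large_write_accesses(35, 15, 23)` gives 73, so the large write is cheaper from this point on.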

Steiner triple systems provide an interesting option for redundancy in large 
disk arrays. They have the unexpected property of lowering the expected update 
penalty when ordered optimally. 



Mixed Workload Comparison of Orderings for STS(15)

Abstract. Combinatorial designs find numerous applications in com-
puter science, and are closely related to problems in coding theory. Pack- 
ing designs correspond to codes with constant weight; 4-sparse partial 
Steiner triple systems (4-sparse PSTSs) correspond to erasure-resilient 
codes able to correct all (except for “bad ones”) 4-erasures, which are 
useful in handling failures in large disk arrays [4,10]. The study of poly- 
topes associated with combinatorial problems has proven to be impor- 
tant for both algorithms and theory. However, research on polytopes for 
problems in combinatorial design and coding theories has been pursued
only recently [14,15,17,20,21]. In this article, polytopes associated with
t-(v, k, λ) packing designs and sparse PSTSs are studied. The subpacking
and sparseness inequalities are introduced. These can be regarded as rank
inequalities for the independence systems associated with these designs.
Conditions under which subpacking inequalities define facets are studied.
Sparseness inequalities are proven to induce facets for the sparse PSTS
polytope; some extremal families of PSTS known as Erdős configurations
play a central role in this proof. The incorporation of these inequalities
in polyhedral algorithms and their use for deriving upper bounds on the
packing numbers are suggested. A sample of 4-sparse PSTS(v), v ≤ 16,
obtained by such an algorithm is shown; an upper bound on the size of
m-sparse PSTSs is presented.



1 Introduction 

In this article, polytopes associated with problems in combinatorial design and 
coding theories are investigated. We start by defining the problems in which 
we are interested, and then describe their polytopes and motivations for this 
research. Throughout the paper, we denote by \binom{V}{k} the family of sets {B ⊆ V :
|B| = k}. Let v ≥ k ≥ t. A t-(v, k, λ) design is a pair (V, B) where V is a v-set
and B is a collection of k-subsets of V called blocks such that every t-subset of
V is contained in exactly λ blocks of B. Design theorists are concerned with the
existence of these designs. A t-(v, k, λ) packing design is defined by replacing the
condition “in exactly λ blocks” in the above definition by “in at most λ blocks”.
The objective is to determine the packing number, denoted by D_λ(v, k, t), which
is the maximum number of blocks in a t-(v, k, λ) packing design. The existence
of a t-(v, k, λ) design can be decided by checking whether the packing number
D_λ(v, k, t) is equal to λ \binom{v}{t} / \binom{k}{t}. Thus, the determination of the packing number
is the most general problem and we will concentrate on it. Designs play a central
role in the theory of error-correcting codes, and, in particular, t-(v, k, 1) packing
designs correspond to constant weight codes of weight k, length v and minimum
distance 2(k - t + 1). For surveys on packing designs see [18,19].
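
A short sketch makes these definitions concrete. It is not part of the paper: the function names are hypothetical, blocks are given as tuples of points, and the example is the 2-(7, 3, 1) design (an STS(7)), whose blocks viewed as codewords have weight 3 and pairwise distance 2(3 - 2 + 1) = 4.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def design_block_count(v, k, t, lam=1):
    """lam * C(v, t) / C(k, t): the number of blocks of a t-(v, k, lam) design."""
    return Fraction(lam * comb(v, t), comb(k, t))


def is_packing(blocks, t, lam=1):
    """True if every t-subset of points lies in at most lam blocks."""
    count = {}
    for b in blocks:
        for sub in combinations(sorted(b), t):
            count[sub] = count.get(sub, 0) + 1
            if count[sub] > lam:
                return False
    return True


def min_distance(blocks):
    """Minimum Hamming distance between the incidence vectors of the blocks."""
    return min(len(set(a) ^ set(b)) for a, b in combinations(blocks, 2))


FANO = [(0, 1, 3), (1, 2, 4), (0, 2, 6), (1, 5, 6), (2, 3, 5), (3, 4, 6), (0, 4, 5)]
```

A packing is a design exactly when its number of blocks reaches `design_block_count`; a non-integer value, as for 2-(8, 3, 1), already rules a design out.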

Determining the packing number is a hard problem in general, although the 
problem has been solved for specific sets of parameters. For instance, the existence
of Steiner Triple Systems (i.e. 2-(v, 3, 1) designs), and the packing number
for Partial Steiner Triple Systems (PSTS) (i.e. 2-(v, 3, 1) packing designs) have
been settled. On the other hand, the study of triple systems is an active area of
research with plenty of open problems. Interesting problems arise in the study
of STSs and PSTSs avoiding prescribed sub-configurations (see the survey [8]).
Let us denote by STS(v) the Steiner triple system (PSTS(v) for a partial one)
on v points. A (p, l)-configuration in a (partial) Steiner triple system is a set of
l blocks (of the (partial) Steiner triple system) spanning p elements. Let m ≥ 4.
An STS(v) is said to be m-sparse if it avoids every (l + 2, l)-configuration for
4 ≤ l ≤ m. Erdős (see [12]) conjectured that for all m ≥ 4 there exists an integer
v_m such that for every admissible v ≥ v_m there exists an m-sparse STS(v).
Again the objective is to determine the sparse packing number, denoted by
D(m, v), which is the maximum number of blocks in an m-sparse PSTS(v).
The 4-sparse PSTSs are the same as anti-Pasch ones, since Pasches are the only
(6, 4)-configurations. A 4-sparse (or anti-Pasch) STS(v) is known to exist for all
v ≡ 3 (mod 6) [2]. For the remaining case, i.e. the case v ≡ 1 (mod 6), there
are constructions and partial results. Anti-mitre Steiner triple systems were first
studied in [6]. The 5-sparse Steiner triple systems are the systems that are both
anti-Pasch and anti-mitre. Although there are some results on 5-sparse STSs
[6,13], the problem is far from settled. In spite of Erdős' conjecture, no m-sparse
Steiner triple system is known for m ≥ 6. The study of m-sparse PSTSs gives
rise to interesting extremal problems in hypergraph theory; in addition, these
designs have applications in computer science. For instance, the 4-sparse (or
anti-Pasch) PSTSs correspond to erasure-resilient codes that tolerate all (except
bad) 4-erasures, which are useful in applications for handling failures in
large disk arrays [4,10].
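
Sparseness can be tested by brute force on small systems. The sketch below is an illustration under stated assumptions (hypothetical function names, triples given as strings of point labels); it counts (p, l)-configurations by enumerating all sets of l triples, which is only practical for small systems.

```python
from itertools import combinations


def count_configurations(triples, p, l):
    """Number of sets of l triples that span exactly p points."""
    sets = [frozenset(t) for t in triples]
    return sum(
        1 for group in combinations(sets, l)
        if len(frozenset().union(*group)) == p
    )


def is_m_sparse(triples, m):
    """True if there is no (l + 2, l)-configuration for any 4 <= l <= m."""
    return all(count_configurations(triples, l + 2, l) == 0
               for l in range(4, m + 1))


STS7 = ["013", "124", "026", "156", "235", "346", "045"]
STS9 = ["012", "036", "138", "048", "147", "057",
        "345", "237", "246", "678", "258", "156"]
```

The STS(7) contains seven Pasch configurations (the four triples missing any one point), whereas the STS(9) contains none and is therefore 4-sparse.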

Let \mathcal{D} be the set of all packing designs of the same kind and with the same
parameters (for instance, the set of all 2-(10, 3, 1) packing designs or the set of
all 5-sparse PSTS(10)). Let P(\mathcal{D}) be the polytope in R^{\binom{V}{k}} given by the convex
hull of the incidence vectors of the packing designs in \mathcal{D}. Thus, determining the
packing number associated with \mathcal{D} amounts to solving the following optimization
problem:

    maximize    \sum_{B \in \binom{V}{k}} x_B
    subject to  x \in P(\mathcal{D}).

If we had a description of P(\mathcal{D}) in terms of linear inequalities, this problem
could be solved via linear programming. Unfortunately, we are unlikely to find
complete descriptions of polytopes for hard combinatorial problems. On the other
hand, some very effective computational methods use partial descriptions of a
problem’s polytope [3]. Therefore, it is of great interest to find classes of facets for
these polytopes. It is also important to design efficient separation algorithms for 
a class of facets. Given a point outside a polytope and a class of valid inequalities 
for the polytope, a separation algorithm determines an inequality that is violated 
by the point or decides one does not exist. This is fundamental in branch-and-cut 
or other polyhedral algorithms that work with partial descriptions of polytopes. 

Polytopes for general t-(v, k, λ) packing designs were first discussed in [14];
their clique facets have been determined for all packings with λ = 1 and k - t ∈
{1, 2} for all t and v [16]. A polyhedral algorithm for t-(v, k, 1) packings and
designs was proposed and tested in [17]. A related work that employs incidence
matrix formulations for 2-(v, k, λ) design polytopes can be found in [20].

In this paper, we present two new classes of inequalities: the subpacking 
and the sparseness inequalities. They are types of rank inequalities when one 
regards the packing designs as independence systems, as discussed in Section 2. 
In Section 3, we focus on the subpacking inequalities, which are valid inequalities
for both t-(v, k, λ) packing designs and sparse PSTSs. We study conditions
under which these inequalities induce facets for the packing design polytope.
In Section 4, we discuss sparseness inequalities. Given m ≥ 4, the l-sparseness
inequalities, 2 ≤ l ≤ m, are valid for the m-sparse PSTS polytope, and proven
to always be facet inducing. In Section 5, we show the results of our branch-and-
cut algorithm for determining the sparse packing number for 4-sparse PSTS(v)
with v ≤ 16. The algorithm follows the lines of the one described in [17], but employs
sparse facets. With these 4-sparse packing numbers in hand, we develop a
simple bound that uses the previous packing number and Chvátal-Gomory type
of cuts to give an upper bound on the next packing numbers. Further research
is discussed in Section 6. 

2 Independence Systems, Packing Designs and their 
Polytopes 

In this section, we define some terminology about independence systems and 
collect some results we use from the independence system literature. We closely 
follow the notation in [11]. Throughout the section, we translate the concepts to the
context of combinatorial designs.

Let N = {v_1, v_2, ..., v_n} be a finite set. An independence system on N is a
family \mathcal{I} of subsets of N closed under inclusion, i.e. satisfying the property: J ∈ \mathcal{I}
and I ⊆ J implies I ∈ \mathcal{I}, for all J ∈ \mathcal{I}. Any set in \mathcal{I} is called independent and any
set outside \mathcal{I} is called dependent. Any minimal (with respect to set inclusion)
dependent set is called a circuit, and an independence system is characterized
by its family of circuits, which we denote by C. The independence number of \mathcal{I},
denoted by α(\mathcal{I}), is the maximum size of an independent set in \mathcal{I}. Given a subset
S of N, the rank of S is defined by r(S) = max{|I| : I ∈ \mathcal{I} and I ⊆ S}. Note
that α(\mathcal{I}) = r(N).

If the circuits in C have size 2, G = (N, C) forms a graph with N as the
nodeset, C as the edgeset, and \mathcal{I} forms the set of independent or stable sets of G.

Remark 1. (Packing Designs) Given t, v, k, λ, let $\mathcal{I}$ be the family
of all t-(v, k, λ) packing designs on the same v-set V. Let
$N = \binom{V}{k}$, the set of all k-subsets of V; then $\mathcal{I}$ is
clearly an independence system on $N$. The packing number is the independence
number. Each circuit in $\mathcal{C}$ corresponds to a subset of
$\binom{V}{k}$ of cardinality λ + 1 such that its k-sets contain a common
t-subset of V. For λ = 1, $\mathcal{C}$ is simply formed by the pairs of k-sets
of V which intersect in at least t points, and the underlying graph is obvious.
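
The case λ = 1 of Remark 1 can be made concrete with a small sketch. The Python below is an illustration only: the names `k_subsets`, `in_conflict` and `packing_number` are hypothetical, and the exhaustive search is not the branch-and-cut method used later in the paper. It treats two k-sets as forming a circuit when they share at least t points and computes the packing number as the size of a largest stable set.

```python
from itertools import combinations


def k_subsets(v, k):
    """All k-subsets of the point set {0, ..., v-1}, i.e. the ground set N."""
    return [frozenset(c) for c in combinations(range(v), k)]


def in_conflict(b1, b2, t):
    """For lambda = 1, two blocks form a circuit iff they share >= t points."""
    return len(b1 & b2) >= t


def packing_number(v, k=3, t=2):
    """Size of a largest t-(v, k, 1) packing, by exhaustive backtracking."""
    ground = k_subsets(v, k)
    best = 0

    def extend(start, chosen):
        nonlocal best
        best = max(best, len(chosen))
        for i in range(start, len(ground)):
            # Even taking every remaining block cannot beat the incumbent.
            if len(chosen) + len(ground) - i <= best:
                break
            block = ground[i]
            if all(not in_conflict(block, other, t) for other in chosen):
                chosen.append(block)
                extend(i + 1, chosen)
                chosen.pop()

    extend(0, [])
    return best
```

The search is exponential and only meant for very small v; for triples (k = 3, t = 2) it can be compared against the packing numbers listed in Table 1.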

Following the definition in [9], an Erdős configuration of order n, n ≥ 1, in
a (partial) Steiner triple system is any (n + 2, n)-configuration which contains
no (l + 2, l)-configuration, 1 < l < n. In fact, this is equivalent to requiring
that 4 ≤ l < n, since there cannot be any (4, 2)- or (5, 3)-configurations in a
PSTS.

Remark 2. (Sparse PSTSs) Let $\mathcal{I}$ be the independence system of the
2-(v, 3, 1) packing designs on the same v-set V. Let $\mathcal{C}$ be its
collection of circuits, namely, the family of all pairs of triples of V whose
intersection has cardinality 2. Adding m-sparseness requirements to
$\mathcal{I}$ amounts to removing from $\mathcal{I}$ the packing designs that
are not m-sparse, and adding extra circuits to $\mathcal{C}$. The circuits to be
added to $\mathcal{C}$ are precisely the Erdős configurations of order l, for
all 4 ≤ l ≤ m.
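
As a hedged illustration of Remark 2, the sketch below tests m-sparseness by brute force: a set of triples passes if it is a PSTS and no l of its triples span exactly l + 2 points for 4 ≤ l ≤ m. The function names and the particular labelling of the Fano plane are assumptions made for this example; the Pasch configuration uses the labelling that appears in Lemma 2.

```python
from itertools import combinations


def is_psts(triples):
    """True if no pair of points lies in two different triples."""
    return all(len(a & b) <= 1 for a, b in combinations(triples, 2))


def forbidden_configurations(triples, m):
    """Yield every (l + 2, l)-configuration with 4 <= l <= m."""
    for l in range(4, m + 1):
        for sub in combinations(triples, l):
            if len(frozenset().union(*sub)) == l + 2:
                yield sub


def is_m_sparse(triples, m):
    """A PSTS is m-sparse if it contains none of the forbidden configurations."""
    if not is_psts(triples):
        return False
    return next(forbidden_configurations(triples, m), None) is None


# The Pasch configuration: 4 triples on 6 points.
PASCH = [frozenset(t) for t in ({1, 2, 5}, {3, 4, 5}, {1, 3, 6}, {2, 4, 6})]

# One labelling of the Fano plane, the unique STS(7).
FANO = [frozenset(t) for t in ({0, 1, 2}, {0, 3, 4}, {0, 5, 6}, {1, 3, 5},
                               {1, 4, 6}, {2, 3, 6}, {2, 4, 5})]
```

Dropping any one triple from `PASCH` leaves a 4-sparse system, which is the sense in which the Pasch configuration is a circuit of the 4-sparse independence system.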

Before we discuss valid inequalities for the independence system polytope,
we recall some definitions. A polyhedron $P \subseteq \mathbb{R}^n$ is the set
of points satisfying a finite set of linear inequalities. A polytope is a
bounded polyhedron. A polyhedron $P \subseteq \mathbb{R}^n$ is of dimension k,
denoted by dim P = k, if the maximum number of affinely independent points in P
is k + 1. We say that P is full dimensional if dim P = n. Let
$d \in \mathbb{R}^n$ and $d_0 \in \mathbb{R}$. An inequality $d^T x \le d_0$ is
said to be valid for P if it is satisfied by all points of P. A subset
$F \subseteq P$ is called a face of P if there exists a valid inequality
$d^T x \le d_0$ such that $F = P \cap \{x \in \mathbb{R}^n : d^T x = d_0\}$;
the inequality is said to represent or to induce the face F. A facet is a face
of P with dimension (dim P) − 1. If P is full dimensional (which can be assumed
w.l.o.g. for independence systems), then each facet is determined by a unique
(up to multiplication by a positive number) valid inequality. Moreover, the
minimal system of inequalities representing P is given by the inequalities
inducing its facets.

Consider again an independence system $\mathcal{I}$ on $N$. The rank inequality
associated with a subset $S$ of $N$ is defined by

$$\sum_{i \in S} x_i \le r(S), \qquad (1)$$

and is obviously a valid inequality for the independence system polytope
$P(\mathcal{I})$. Necessary or sufficient conditions for a rank inequality to
induce a facet have been discussed in [11]. We recall some definitions. A subset
$S$ of $N$ is said to be closed if $r(S \cup \{i\}) \ge r(S) + 1$ for all
$i \in N \setminus S$. $S$ is said to be nonseparable if
$r(S) < r(T) + r(S \setminus T)$ for every nonempty proper subset $T$ of $S$.




A necessary condition for (1) to induce a facet is that $S$ be closed and
nonseparable. This was observed by Laurent [11], and was stated by Balas and
Zemel [1] for independent sets in graphs. A sufficient condition for (1) to
induce a facet is given in the next theorem. Let $\mathcal{I}$ be an
independence system on $N$ and let $S$ be a subset of $N$. Let $\mathcal{C}$ be
the family of circuits of $\mathcal{I}$ and let $\mathcal{C}_S$ denote its
restriction to $S$. The critical graph of $\mathcal{I}$ on $S$, denoted by
$G_S(\mathcal{I})$, is defined as having $S$ as its nodeset and with edges
defined as follows: $i_1, i_2 \in S$ are adjacent if and only if the removal of
all circuits of $\mathcal{C}_S$ containing $\{i_1, i_2\}$ increases the rank of
$S$.

Theorem 1. (Laurent [11], Chvátal [5] for graphs) Let $S \subseteq N$. If $S$
is closed and the critical graph $G_S(\mathcal{I})$ is connected, then the rank
inequality (1) associated with $S$ induces a facet of the polytope
$P(\mathcal{I})$.

Proposition 1. (Laurent [11], Cornuéjols and Sassano [7]) The following are
equivalent:

1. The rank inequality (1) induces a facet of $P(\mathcal{I})$.

2. $S$ is closed and the rank inequality (1) induces a facet of
$P(\mathcal{I}_S)$.

3 Subpacking Inequalities for t-(v, k, λ) Packings

Let us denote by $P_{t,v,k,\lambda}$ the polytope associated with the
t-(v, k, λ) packing designs on the same v-set V, and by
$\mathcal{I}_{t,v,k,\lambda}$ the corresponding independence system on
$N = \binom{V}{k}$. Let $S \subseteq V$. Then, it is clear that
$r\bigl(\binom{S}{k}\bigr) = D_\lambda(|S|, k, t)$ and the rank inequality
associated with $\binom{S}{k}$ is given by

$$\sum_{B \in \binom{S}{k}} x_B \le D_\lambda(|S|, k, t). \qquad (2)$$

We call this the subpacking inequality associated with $S$, which is clearly
valid for $P_{t,v,k,\lambda}$. In this section, we investigate conditions for
this inequality to be facet inducing. The next proposition gives a sufficient
condition for a subpacking inequality not to induce a facet.

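Proposition 2. If there exists a t-(v, k, λ) design, then

$$\sum_{B \in \binom{V}{k}} x_B \le D_\lambda(v, k, t) \qquad (3)$$

does not induce a facet of $P_{t,v,k,\lambda}$.

Proof. Since there exists a t-(v, k, λ) design, it follows that
$D_\lambda(v, k, t) = \lambda \binom{v}{t} / \binom{k}{t}$. Then, inequality (3)
can be obtained by adding the clique facets $\sum_{B \supseteq T} x_B \le \lambda$,
for all $T \subseteq V$, $|T| = t$: every block contains $\binom{k}{t}$
t-subsets, so each $x_B$ is counted exactly $\binom{k}{t}$ times on the
left-hand side, while the right-hand sides add up to $\lambda \binom{v}{t}$.
Thus, (3) cannot induce a facet. □

The next proposition addresses the extendibility of facet inducing subpacking
inequalities from $P_{t,|S|,k,\lambda}$ to $P_{t,v,k,\lambda}$, v ≥ |S|.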

Proposition 3. Let $S \subseteq V$. Then, the following are equivalent:

1. the subpacking inequality (2) induces a facet of $P_{t,v,k,\lambda}$.

2. the subpacking inequality (2) induces a facet of $P_{t,|S|,k,\lambda}$ and,
for all $B' \in \binom{V}{k} \setminus \binom{S}{k}$, there exists a
t-(|S|, k, λ) packing design $(S, \mathcal{B})$ with
$|\mathcal{B}| = D_\lambda(|S|, k, t)$ such that
$(V, \mathcal{B} \cup \{B'\})$ is a t-(v, k, λ) packing design.

Proof. The last condition in 2 is equivalent to $\binom{S}{k}$ being closed for
the independence system $\mathcal{I}_{t,v,k,\lambda}$; thus, the equivalence
comes directly from Proposition 1. □

For the particular case of k = t + 1 and λ = 1, facet inducing subpacking
inequalities are always extendible.

Proposition 4. (Guaranteed extendibility of a class of subpacking facets) Let
k = t + 1 and λ = 1. Then, the subpacking inequality

$$\sum_{B \in \binom{S}{t+1}} x_B \le D_1(|S|, t+1, t) \qquad (4)$$

associated with $S \subseteq V$ induces a facet for $P_{t,v,t+1,1}$ if and only
if it induces a facet for $P_{t,|S|,t+1,1}$.

The proof of Proposition 4 involves showing that the second condition in item 2
of Proposition 3 holds for k = t + 1 and λ = 1.

Theorem 2. (Facet defining subpacking inequalities for PSTSs) Let
$S \subseteq V$, |S| ≤ 10. The subpacking inequality associated with $S$ induces
a facet of $P_{2,v,3,1}$, v ≥ |S|, if and only if |S| ∈ {4, 5, 10}.

Sketch of the proof. Since there exist STS(v) for all v ≡ 1, 3 (mod 6),
Proposition 2 covers the cases |S| ∈ {7, 9}. It remains to deal with the cases
|S| ∈ {4, 5, 6, 8, 10} (see Table 1). Subpacking inequalities with |S| = 4 are
facet-inducing clique inequalities. For the case |S| ∈ {5, 10}, we show that the
corresponding critical graphs are connected, which (by Theorem 1) is sufficient
to show the inequalities induce facets of $P_{2,|S|,3,1}$. Proposition 4
guarantees they also define facets of $P_{2,v,3,1}$. For the case |S| ∈ {6, 8},
we show that the corresponding subpacking inequalities can be written as
(non-trivial) linear combinations of other valid inequalities, which implies
that they do not induce facets. □

Remark 3. (Separation of subpacking inequalities) For a constant C, subpacking
inequalities with |S| ≤ C can be separated in polynomial time. This is the case,
since there are exactly $\sum_{s \le C} \binom{v}{s} \in O(v^C)$ inequalities to
check, which is a polynomial in the number of variables of the problem, which is
$\binom{v}{k}$.
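
The enumeration argument of Remark 3 can be sketched as follows for the PSTS case (t = 2, k = 3, λ = 1). This is a minimal illustration, not the separation routine used in the authors' implementation: the function name, the dictionary representation of a fractional point x, and the tolerance `eps` are assumptions, and the right-hand sides are the values $D_1(|S|,3,2)$ of the facet-inducing sizes listed in Table 1.

```python
from itertools import combinations

# Right-hand sides D_1(|S|, 3, 2) for the facet-inducing sizes in Table 1.
RANK = {4: 1, 5: 2, 10: 13}


def violated_subpacking(x, points, sizes=(4, 5), eps=1e-9):
    """Return (S, lhs, rhs) for each subpacking inequality violated by x.

    x maps frozenset triples to (possibly fractional) values; triples that
    are missing from x count as 0.
    """
    violated = []
    for s in sizes:
        for subset in combinations(sorted(points), s):
            lhs = sum(x.get(frozenset(b), 0.0) for b in combinations(subset, 3))
            if lhs > RANK[s] + eps:
                violated.append((frozenset(subset), lhs, RANK[s]))
    return violated
```

For example, a point that puts weight 1/2 on each of the four triples of a 4-set violates the clique inequality of that 4-set and nothing else among the sizes 4 and 5.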



4 Sparseness Facets for Sparse PSTSs 

Let us denote by $P_{m,v}$ the polytope associated with m-sparse PSTS(v) on the
same v-set V, and by $\mathcal{I}_{m,v}$ the corresponding independence system.
The main contribution of this section is a class of facet inducing inequalities
for $P_{m,v}$, which we call sparseness inequalities, given by Theorem 3.

Table 1. Summary of facet inducing subpacking inequalities of $P_{2,v,3,1}$
for |S| ≤ 10. The third column states whether
$\sum_{B \in \binom{S}{3}} x_B \le D_1(|S|,3,2)$ is facet inducing for v ≥ |S|.

|S|              D_1(|S|,3,2)           Facet inducing   Reference
                 or r(\binom{S}{3})
4                1                      Yes              maximal clique [17]
5                2                      Yes              Theorem 2
6                4                      No               Theorem 2
7                7                      No               ∃ STS(7) + Proposition 2
8                8                      No               Theorem 2
9                12                     No               ∃ STS(9) + Proposition 2
10               13                     Yes              Theorem 2
≡ 1, 3 (mod 6)   (|S|^2 − |S|)/6        No               ∃ STS(|S|) + Proposition 2


Lemma 1. (Lefmann et al. [12, Lemma 2.3])

Let l, r be positive integers, l ≥ 1. Then any (l + 2, l + r)-configuration in a
Steiner triple system contains an (l + 2, l)-configuration.

The proofs of the next two lemmas are omitted in this extended abstract. 

Lemma 2. (Construction of an Erdős configuration, for all n ≥ 4)

Consider the following recursive definition:

$$\mathcal{E}_4 = \{E_1 = \{1,2,5\},\ E_2 = \{3,4,5\},\ E_3 = \{1,3,6\},\ E_4 = \{2,4,6\}\},$$

$$\mathcal{E}_5 = \mathcal{E}_4 \setminus \{E_4\} \cup \{\{2,4,7\},\ E_5 = \{5,6,7\}\},$$

$$\mathcal{E}_{n+1} = \mathcal{E}_n \setminus \{E_n\} \cup \{E_n \setminus \{n+2\} \cup \{n+3\}\}
\cup \{E_{n+1} = \{n+2,\ n+3,\ 1 + ((n-1) \bmod 4)\}\}, \quad n \ge 5.$$

Then, for all n ≥ 4, $\mathcal{E}_n$ is an Erdős configuration of order n.
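
The recursion of Lemma 2 is easy to misread, so here is a hedged Python sketch that builds the configurations and checks the defining property by brute force for small n. It assumes that $E_n$ in the recursive step denotes the most recently added triple; the function names are hypothetical, and the check is an illustration, not a proof of the lemma.

```python
from itertools import combinations


def erdos_configuration(n):
    """Triples of the recursive construction of order n (n >= 4)."""
    if n < 4:
        raise ValueError("the construction starts at n = 4")
    triples = [{1, 2, 5}, {3, 4, 5}, {1, 3, 6}, {2, 4, 6}]
    if n >= 5:
        triples[-1] = {2, 4, 7}
        triples.append({5, 6, 7})
    for step in range(5, n):  # builds order step + 1 from order step
        last = triples[-1]
        # Move the last triple off point step + 2 onto the new point step + 3.
        triples[-1] = (last - {step + 2}) | {step + 3}
        triples.append({step + 2, step + 3, 1 + (step - 1) % 4})
    return [frozenset(t) for t in triples]


def is_erdos_configuration(triples):
    """Check: n triples on n + 2 points, a PSTS, no (l + 2, l)-subconfiguration."""
    n = len(triples)
    if len(frozenset().union(*triples)) != n + 2:
        return False
    if any(len(a & b) > 1 for a, b in combinations(triples, 2)):
        return False
    for l in range(2, n):
        for sub in combinations(triples, l):
            if len(frozenset().union(*sub)) == l + 2:
                return False
    return True
```

For n = 4 the construction returns the Pasch configuration, and for n = 5 a configuration of five triples on seven points.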



Lemma 3. Let v − 2 ≥ l ≥ 4 and let $T$ be an (l + 2)-subset of V. Let
$R \in \binom{V}{3} \setminus \binom{T}{3}$. Then, there exists an Erdős
configuration $\mathcal{S}$ of order l on the points of $T$ and a triple
$S \in \mathcal{S}$, such that $\mathcal{S} \setminus \{S\} \cup \{R\}$ is an
l-sparse PSTS(v).



Theorem 3. (m-sparseness facets) Let m ≥ 4. Then, for any 2 ≤ l ≤ m and
any (l + 2)-subset $T$ of V, the inequality

$$s(T): \quad \sum_{B \in \binom{T}{3}} x_B \le l - 1$$

defines a facet for $P_{m,v}$.

Proof. Inequalities s(T) with l ∈ {2, 3} are facet inducing for $P_{2,v,3,1}$
(see Table 1), and even though the inclusion $P_{m,v} \subseteq P_{2,v,3,1}$ is
in general proper, it is easy to show they remain facet inducing for $P_{m,v}$.
Thus, we concentrate on l ≥ 4.
The validity of s(T) comes from the definition of l-sparse PSTSs, i.e. the fact
that $r\bigl(\binom{T}{3}\bigr) = l - 1$ for $\mathcal{I}_{m,v}$. Lemma 3 implies
that $\binom{T}{3}$ is closed. Thus, by Theorem 1, it is sufficient to show that
the critical graph $G_{\binom{T}{3}}(\mathcal{I}_{m,v})$ is connected.

Let $\mathcal{E}$ be an Erdős configuration of order l on the points of $T$.
There must be two triples in $\mathcal{E}$ whose intersection is a single point;
call those triples $B_1$ and $B_2$. We claim $\mathcal{E} \setminus \{B_1\}$ and
$\mathcal{E} \setminus \{B_2\}$ are m-sparse 2-(v, 3, 1) packings. Indeed,
$|\mathcal{E} \setminus \{B_i\}| = |\mathcal{E}| - 1 = l - 1$, and since
$\mathcal{E}$ was (l − 1)-sparse, so is $\mathcal{E} \setminus \{B_i\}$,
i = 1, 2. Thus, there exists an edge in the critical graph
$G_{\binom{T}{3}}(\mathcal{I}_{m,v})$ connecting triples $B_1$ and $B_2$. By
permuting $T$, we can show this is true for any pair of triples which intersects
in one point. That is, there exists an edge in
$G_{\binom{T}{3}}(\mathcal{I}_{m,v})$ connecting $C_1$ and $C_2$, for any
$C_1, C_2 \in \binom{T}{3}$ with $|C_1 \cap C_2| = 1$. It is easy to check that
this graph is connected. □
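
The final step of the proof, connectivity of the graph on the triples of T in which two triples are adjacent when they meet in exactly one point, can be checked mechanically for small |T|. The sketch below is only such a check (the function name is hypothetical); it does not construct the critical graph itself, which contains at least these edges.

```python
from itertools import combinations


def one_point_graph_is_connected(size):
    """Triples of a `size`-set, adjacent iff they share exactly one point."""
    triples = [frozenset(c) for c in combinations(range(size), 3)]
    seen = {triples[0]}
    stack = [triples[0]]
    while stack:  # depth-first search from the first triple
        current = stack.pop()
        for other in triples:
            if other not in seen and len(current & other) == 1:
                seen.add(other)
                stack.append(other)
    return len(seen) == len(triples)
```

The relevant sizes in the proof are |T| = l + 2 ≥ 6. For |T| = 4 the graph has no edges at all, since any two triples of a 4-set share two points.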

Remark 4. The following is an integer programming formulation for the
optimization problem associated with $P_{m,v}$, in which all the inequalities
are facet inducing (see Theorem 3). Note that the second type of inequalities
can be omitted from the integer programming formulation, since for integral
points they are implied by the first type of inequalities (the first type
guarantees that x is a PSTS).

$$
\begin{array}{lll}
\text{maximize}   & \sum_{B \in \binom{V}{3}} x_B & \\
\text{subject to} & \sum_{B \in \binom{T}{3}} x_B \le 1,   & \text{for all } T \subseteq V,\ |T| = 4, \\
                  & \sum_{B \in \binom{T}{3}} x_B \le 2,   & \text{for all } T \subseteq V,\ |T| = 5, \\
                  & \sum_{B \in \binom{T}{3}} x_B \le 3,   & \text{for all } T \subseteq V,\ |T| = 6, \\
                  & \qquad \vdots & \\
                  & \sum_{B \in \binom{T}{3}} x_B \le m-1, & \text{for all } T \subseteq V,\ |T| = m+2, \\
                  & x \in \{0,1\}^{\binom{V}{3}}. &
\end{array}
$$

Remark 5. (Separation of m-sparse facets) For constant m ≥ 4, l-sparse facets,
l ≤ m, can be separated in polynomial time. This is the case, since there are
exactly $\sum_{l=2}^{m} \binom{v}{l+2} \in O(v^{m+2})$ inequalities to check,
which is a polynomial in the number of variables of the problem, which is
$\binom{v}{3}$.



5 Using Facets for Lower and Upper Bounds



In this section, we illustrate some interesting uses of valid inequalities for
packing design problems. Recall that D(m, v) denotes the maximum size of an
m-sparse PSTS(v). We show an upper bound on D(m, v) based on valid subpacking
inequalities for m-sparse PSTSs. We also display the results of an algorithm
that uses 4-sparse facets to determine D(4, v).

Proposition 5. (Upper bound for m-sparse number) Let m ≥ 4. Then,

$$D(m,v) \le U(m,v) := \left\lfloor \frac{D(m,v-1) \cdot v}{v-3} \right\rfloor.$$

Table 2. The anti-Pasch (4-sparse) PSTS number for small v

        exact*      upper bounds**
v       D(4,v)      D_1(v,3,2)    U(4,v)
6       3           4             4
7       5           7             5
8       8           8             8
9       12          12            12
10      12          13            17
11      15          17            16
12      19          20            20
13      ≥ 24        26            24
14      28          28            30
15      35          35            35
16      37          37            43

* results from branch-and-cut algorithm
** upper bounds from known packing numbers and from Proposition 5

To the best of our knowledge, the values of D(4, v) for v ∈ [10, 13] are new
results.



Proof. There are v rank inequalities of the form
$\sum_{B \in \binom{T}{3}} x_B \le D(m, v-1)$, for $T \in \binom{V}{v-1}$. Each
triple appears in v − 3 of these inequalities. Thus, adding these inequalities
and dividing by v − 3 yields
$\sum_{B \in \binom{V}{3}} x_B \le \frac{D(m,v-1) \cdot v}{v-3}$. Since the
left-hand side is integral, we take the floor function on the right-hand side.
The inequality is valid in particular for x being the incidence vector of a
maximum m-sparse PSTS(v), in which case the left-hand side is equal to
D(m, v). □
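
As a worked check of Proposition 5, the short sketch below recomputes the column U(4, v) of Table 2 from the exact values D(4, v − 1). The dictionary `D4` is transcribed from Table 2 under two assumptions: the first data row of the table is read as v = 6, and D(4, 13) = 24 is the value concluded in the text of this section. The function name is hypothetical.

```python
def upper_bound(d_prev, v):
    """U(m, v) = floor(D(m, v - 1) * v / (v - 3)), using integer arithmetic."""
    if v <= 3:
        raise ValueError("the bound needs v > 3")
    return (d_prev * v) // (v - 3)


# D(4, v) as reported in Table 2 (see the assumptions stated above).
D4 = {6: 3, 7: 5, 8: 8, 9: 12, 10: 12, 11: 15, 12: 19, 13: 24, 14: 28, 15: 35}

# Bound for each v from the exact value at v - 1.
U4 = {v: upper_bound(D4[v - 1], v) for v in range(7, 17)}
```

The bound is tight for v = 7, 8, 9 and 13, and it is what closes the case v = 13, where the best solution found has size 24.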

In Table 2, we show values for D(4, v) obtained by our algorithm. To the
general algorithm in [17], we added 4-sparse inequalities. Due to their large
number, the 4-sparse inequalities are not included in the original integer
programming formulation, but are added whenever violated. For v = 13, it was not
possible to solve the problem to optimality but a solution of size 24 was
obtained; since this matches one of the upper bounds, we conclude
D(4, 13) = 24. All other cases were solved to optimality.



6 Conclusions and Further Work 

In this article, we derive and study new classes of valid and facet inducing in- 
equalities for the packing designs and m-sparse PSTS polytopes. We also exem- 
plify how this knowledge can be used in algorithms to construct designs as well 
as for deriving upper bounds on packing numbers. A number of extensions of this 
work could be pursued. For instance, we are currently investigating how to gen- 
eralize results from Table 1 in order to determine the facet inducing subpacking 
inequalities of PSTSs for all |S|. We are also working on the design of separation
algorithms for m-sparse facets that would be more efficient than the naive one 
which checks all inequalities (see complexity in Remark 5). Other directions for 
further research are the study of other rank inequalities and the investigation of 
new upper bounds on the lines suggested in Section 5. In an expanded version of 
this article, we intend to include the proofs that were omitted in this extended
abstract as well as some of the extensions mentioned above.

Abstract. We pose and completely solve the existence of pancyclic 2-
factorizations of complete graphs and complete bipartite graphs. Such 
2-factorizations exist for all such graphs, except a few small cases which 
we have proved are impossible. The solution method is simple but pow- 
erful. The pancyclic problem is intended to showcase the power this 
method offers to solve a wide range of 2-factorization problems. Indeed, 
these methods go a long way towards being able to produce arbitrary 
2-factorizations with one or two cycles per factor. 



1 Introduction 

Suppose that there is a conference being held at Punta del Este, Uruguay. There
are 2n + 1 people attending the conference and it is to be held over n days.
Each evening there is a dinner which everyone attends. To accommodate the many
different sizes of conferences, the Las Dunas Hotel has many different sizes of
tables. In fact, they have every table size from a small triangular table to
large round tables seating 2n + 1 people. When this was noticed, the organizers,
being knowledgeable in combinatorics, asked themselves if a seating arrangement
could be made for each evening such that every person sat next to every other
person exactly once over the course of the conference and each size table was
used at least once.

Such a schedule, really a decomposition of $K_{2n+1}$ into spanning graphs all
with degree 2 (collections of cycles), would be an example of a 2-factorization
of $K_{2n+1}$. Due to their usefulness in solving scheduling problems, 2-factorizations
have been well studied. The Oberwolfach problem asks for a 2-factorization in 
which each subgraph in the decomposition has the same pattern of cycles and 
much work has been done toward its solution [2,7]. This corresponds to the ho- 
tel using the exact same set of tables each night. Often other graphs besides 
odd complete graphs are investigated. Complete graphs of even order with a 
perfect matching removed so the graph has even degree have received much at- 
tention [1]. In such solutions each person would miss sitting next to exactly one 
other during the conference. Oberwolfach questions have also been posed and 
solved for complete bipartite graphs [7]. The problem posed in the introductory 
paragraph asks that every cycle size appear and so is called the pancyclic
2-factorization problem; since it forces such different cycle sizes, the title
‘anti-Oberwolfach problem’ emphasizes the contrast. There are analogous
formulations for an even number of people with a complete matching removed
(co-authors avoiding each other to prevent conflict) and for bipartite graphs as
well (the seating arrangements alternating computer scientists and
mathematicians to foster cross-disciplinary communication).

The Conference Organizers soon noted that tables of size 2n and 2n − 1,
although available, were forbidden since the remaining people would be forced
to sit at tables of size 1 or 2, which did not exist and would preclude every
pair being neighbors exactly once. After realizing this and doing a preliminary
count, the organizers then asked themselves for a schedule that would include
the first evening with everyone seated around one large table of size 2n + 1,
an evening with a size three table paired with a size 2n − 2 table, an evening
with a size four table paired with a size 2n − 3 table and so forth up to an
evening with a size n table paired with a size n + 1 table. There was one
evening remaining and the organizers thought it would be nice to have everyone
seated again at one table for the final dinner together.
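
The preliminary count can be reproduced in a few lines. The sketch below lists the table sizes for each of the n evenings and is only an illustration of the counting argument; the function name is hypothetical and nothing here constructs an actual seating.

```python
def dinner_schedule(n):
    """Table sizes for the n evenings of a conference with 2n + 1 people."""
    if n < 2:
        raise ValueError("the schedule needs two evenings at one large table")
    total = 2 * n + 1
    schedule = [(total,)]  # first evening: one large table
    schedule += [(i, total - i) for i in range(3, n + 1)]
    schedule.append((total,))  # final evening: one large table again
    return schedule
```

With n = 5, for instance, the evenings use tables of sizes 11, then 3 and 8, 4 and 7, 5 and 6, and finally 11 again, so every size from 3 to 2n − 2 = 8 appears together with 2n + 1 = 11.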

If the solution methods from the Oberwolfach problem can be paired with
methods for the anti-Oberwolfach problem, then it is conceivable that general
2-factorization problems can be tackled with great power. This would enable
us to answer many different and new scheduling and tournament problems. In- 
deed, the pancyclic question is recreational in nature but we use it as a convenient 
context in which to present powerful and very serious construction methods that 
can contribute to a broader class of 2-factorizations. 

Another primary motivation for this problem is a series of recent papers
investigating the possible numbers of cycles in cycle decompositions of complete
graphs [3] and in 2-factorizations [4,5]. For each n, the number of cycles that
appear in an anti-Oberwolfach solution is admissible, so the question was asked
whether this specific structure was possible. We show that the answer to all versions of the problem,
complete odd graphs, complete even graphs minus a complete matching, and 
complete bipartite graphs, is affirmative, except for small cases which we have 
proved impossible. The solution method is very similar to Piotrowski’s approach 
to 2-factorization problems: we modify pairs of Hamiltonian cycles into pairs of 
2-factors with the desired cycle structures. 

In this paper we offer first some definitions and discussion of 2-factorizations, 
formalizing the notions discussed above. Then we solve the standard and bipar- 
tite formulations of the anti-Oberwolfach problem. We end with a discussion of 
the solution method, possible extensions of the problem, and the power these 
methods provide for constructing very general classes of 2-factorizations. 



2 Definitions and Discussion 

Definition 1 A k-factorization of a graph G is a decomposition of G into
spanning subgraphs all regular of degree k. Each such subgraph is called a
k-factor.

We are interested in a special class of 2-factorizations, but also use 1-factors
(perfect matchings) on occasion. 

Definition 2 A pancyclic 2-factorization of a graph, G, of order n, is a 2- 
factorization of G where a cycle of each admissible size, 3, 4, . . . , n — 4, n — 3, n, 
appears at least once in some 2-factor. 

There is a similar definition for the bipartite graphs in which no odd cycles can 
appear: 

Definition 3 A pancyclic 2-factorization of a bipartite graph, G, of even order
n, is a 2-factorization of G where a cycle of each admissible size,
4, 6, . . . , n − 4, n, appears at least once in some 2-factor.

We ask whether such 2-factorizations exist for complete odd graphs $K_{2n+1}$,
complete even graphs with a 1-factor removed to make the degree even,
$K_{2n} - nK_2$, and complete bipartite graphs, some with a 1-factor removed,
$K_{2n,2n}$ and $K_{2n+1,2n+1} - (2n+1)K_2$.

In every case, counting shows that all the 2-factors that are not Hamiltonian
(an n-cycle) must be of the form: an i-cycle and an (n − i)-cycle. We define
here a notation to refer to the different structure of 2-factors:

Definition 4 An i, (n − i)-factor is a 2-factor of an order n graph, G, that is
the disjoint union of an i-cycle and an (n − i)-cycle.

In each case the solution is similar. For each graph in question, G, we present
a 2-factorization, $\{F_0, F_1, \ldots, F_k\}$, and a cyclic permutation σ of a
subset of the vertices of G so that $F_i = \sigma^i(F_0)$ and some power of σ is
the identity. We decompose the union of consecutive pairs of 2-factors,
$F_i \cup F_{i+1}$, into two other 2-factors with desired cycle structures by
swapping pairs of edges. The cyclic automorphism group guarantees that any two
unions of any two consecutive 2-factors are isomorphic. Thus we can formulate
general statements about decomposition of the complete graphs into these unions
and the possible decompositions of these unions. A few cases require more
sophisticated manipulation. In certain cases we swap only one pair of edges; in
others, we use up to four such swaps. These methods demonstrate the power of
Piotrowski’s approach of decomposing pairs of Hamiltonian cycles from a Walecki
decomposition into the desired 2-factors.

3 Main Results 

We demonstrate the solution method in more detail for the case $K_{2n+1}$. In
the other cases the solution method is essentially the same with minor
adjustments and a slight loss of economy.

3.1 The Solution for $K_{2n+1}$

Walecki’s decomposition of $K_{2n+1}$ into Hamiltonian 2-factors gives us the
starting point of our construction.

Lemma 1. There exists a decomposition of $K_{2n+1}$ into Hamiltonian 2-factors
that are cyclic developments of each other.

The first of the Hamiltonian 2-factors is shown in Figure 1. Each of the remaining 
n — 1 2-factors is a cyclic rotation of the first. 
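
Since Figure 1 does not survive in this text, the following sketch uses the standard zigzag form of Walecki's construction on the vertex set $\mathbb{Z}_{2n} \cup \{\infty\}$; the labelling is an assumption and may differ from the one drawn by the authors. It builds the n rotated Hamiltonian cycles and verifies that they partition the edges of $K_{2n+1}$. All function names are hypothetical.

```python
from itertools import combinations

INF = "inf"  # the extra vertex, fixed by the rotation


def walecki_cycle(n, shift=0):
    """Hamiltonian cycle of K_{2n+1} on Z_{2n} plus INF, as a vertex list."""
    m = 2 * n
    zigzag = [0]
    for j in range(1, n):
        zigzag += [j, -j]  # 0, 1, -1, 2, -2, ..., n-1, -(n-1)
    zigzag.append(n)
    return [INF] + [(x + shift) % m for x in zigzag]


def cycle_edges(cycle):
    """Edge set of a cycle given as a cyclic list of vertices."""
    size = len(cycle)
    return {frozenset((cycle[i], cycle[(i + 1) % size])) for i in range(size)}


def walecki_decomposition(n):
    """The n cyclic rotations of the base cycle."""
    return [walecki_cycle(n, shift) for shift in range(n)]


def is_hamiltonian_decomposition(n, cycles):
    """True if the cycles are spanning and partition the edges of K_{2n+1}."""
    vertices = list(range(2 * n)) + [INF]
    all_edges = {frozenset(e) for e in combinations(vertices, 2)}
    used = [cycle_edges(c) for c in cycles]
    spanning = all(len(c) == len(vertices) and set(c) == set(vertices)
                   for c in cycles)
    disjoint = sum(len(e) for e in used) == len(all_edges)
    return spanning and disjoint and set().union(*used) == all_edges
```

For n = 2 the base cycle is (∞, 0, 1, 3, 2) and its rotation is (∞, 1, 2, 0, 3); together they cover the ten edges of $K_5$.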




Fig. 1. A Walecki 2-factor of $K_{2n+1}$.



The union of two consecutive Hamilton cycles from the Walecki decomposition
is isomorphic to the graph given in Figure 2. This graph can be decomposed
into two other Hamiltonian 2-factors that are not identical to the original
Walecki 2-factors. These are shown in Figure 3. It is these two Hamiltonian
2-factors that can be decomposed into 2-factors with various cycle structures.

Fig. 2. The union of two consecutive Walecki 2-factors.


Lemma 2. The graph in Figure 2 can be decomposed into two 2-factors such
that the first is a (2i + 1), 2(n − i)-factor and the second is a
(2j + 1), 2(n − j)-factor for any 1 ≤ i ≠ j ≤ n − 2. Alternatively the second
can remain a Hamiltonian 2-factor, with no restriction on i.

Fig. 3. Decomposition of the union of two consecutive Walecki 2-factors into two
other Hamiltonian 2-factors.



Proof. The first decomposition is achieved by swapping four edges between the 
two graphs of Figure 3 and is shown in Figure 4. 




Fig. 4. Decomposition into a (2i + 1), 2(n − i)-factor and a (2j + 1), 2(n − j)-factor.



The second decomposition is achieved by swapping only two edges between 
the two graphs of Figure 3 and is shown in Figure 5. 

In both figures the sets of swapped edges are shown as dashed or dotted lines. 

The flexibility of the parameters i and j together with the decomposition of
$K_{2n+1}$ into n cyclically derived Hamiltonian 2-factors gives the main result.

Theorem 1 There exists a pancyclic 2-factorization of $K_{2n+1}$ for all n > 1.

Fig. 5. Decomposition into a (2i + 1), 2(n − i)-factor and a Hamiltonian 2-factor.



3.2 The Remaining Cases: $K_{2n} - nK_2$, $K_{2n,2n}$ and $K_{2n+1,2n+1} - (2n+1)K_2$

Decomposing graphs with odd degree into 2-factors is impossible since each
2-factor accounts for two edges from each vertex. In these cases it is customary
to remove a 1-factor and attempt to decompose the resulting graph, which has
even degree. The existence of pancyclic 2-factorizations of $K_{2n} - nK_2$,
$K_{2n,2n}$ and $K_{2n+1,2n+1} - (2n+1)K_2$ is achieved in the same manner as
that of $K_{2n+1}$. We decompose the graph into 2-factors (usually Hamiltonian)
that are cyclic developments of each other, so that all unions of pairs of
consecutive 2-factors are isomorphic. The union of two consecutive 2-factors has
a structure very similar to that in the case $K_{2n+1}$ and they can be broken
into smaller cycles in almost exactly the same manner. There are two minor,
though notable, differences. When decomposing $K_{2n} - nK_2$, it is necessary
in one fourth of the cases to reinsert the removed 1-factor and remove another
to be able to construct odd numbers of 2-factors with different parities. In the
bipartite case it is sometimes necessary to apply the edge swapping additional
times since the original decomposition into 2-factors may not have produced
Hamiltonian 2-factors. Again, in each case the complete solution is achievable.

Theorem 2 There exists a pancyclic 2-factorization of $K_{2n} - nK_2$ for all n > 1.

Theorem 3 There exists a pancyclic 2-factorization of $K_{n,n}$ for all even
n > 4 and $K_{n,n} - nK_2$ for all odd n > 1. The cases n = 1, 2, 4 are all
impossible.

In all cases the union of the edges from each set of four swapped edges form 
an induced 4-cycle, and the remainder of the graphs are paths connecting pairs 
of vertices from the 4-cycle. This induced 4-cycle and connected paths are the 
underlying structure of the construction. Consideration of this structure allows 
the swapping to be formalized and made rigorous, so that the proofs can rest
on a foundation other than nice figures. Unfortunately the statements of these 
swapping lemmas are lengthy and technical and space does not permit their 
inclusion. 



4 Conclusion 

As a demonstration of a powerful method for a wide range of 2-factorization 
problems, of similar type to Piotrowski’s Oberwolfach constructions, we have 
solved the pancyclic 2-factorization problem for four infinite families of
complete or nearly complete graphs, $K_{2n+1}$, $K_{2n} - nK_2$, $K_{2n,2n}$
and $K_{2n+1,2n+1} - (2n+1)K_2$. In each
case, pancyclic 2-factorizations exist for all n except for a very few small n where 
the solution is shown not to exist. Moreover, in each case the solution method 
is similar. We start with a 2-factorization of the graph in question with a cyclic 
automorphism group. The union of consecutive pairs of the 2-factors is shown 
to be decomposable into two 2-factors with a wide range of cycle structures 
by judicious swapping of the two pairs of opposite edges of induced 4-cycles. 
This flexibility of decomposition and the automorphism group allow the desired 
solution to be constructed. 

The plethora of induced 4-cycles in the union of consecutive 2-factors from 
the various 2-factorizations allow us not only to construct the various solutions 
in many different ways, but to go far beyond the problem solved here. In
$K_{2n+1}$ it seems that the swapping lemmas can only produce one odd cycle per
factor and at most two in $K_{2n} - nK_2$. Beyond this restriction there is a great deal of
flexibility in the application of the swapping lemmas. The use of these methods 
to solve the pancyclic 2-factorization problem indicates the strength and range of 
the swapping lemmas. We propose that the methods outlined in this article might 
be powerful for constructing Oberwolfach solutions, and other 2-factorization 
and scheduling problems. One very interesting problem is the construction of 2- 
factorizations with prescribed lists of cycle types for each 2-factor. If the list can 
only contain 2-factors with one or two cycles, then the methods presented here 
nearly complete the problem. The only obstacle towards the solution of these 
problems is the construction of pairs of 2-factors with the same cycle type, the 
Oberwolfach aspect of the question. P. Gvozdjak is currently working on such 
constructions. 

There are other pancyclic decomposition questions that can be asked. The 
Author and H. Verrall are currently working on the directed analogue of the 
anti-Oberwolfach problem. Other obvious pancyclic problems can be formulated 
for higher λ for both directed and undirected graphs; 2-path covering pancyclic
decompositions, both resolvable and non-resolvable. In each of these cases we
gain the flexibility to ask for different numbers within each size class of cycles, 
possibly admitting digons and losing other restrictions enforced by the tightness 
of the case solved here. 
1 Summary

The subject of this survey is the directed graph induced by the hyperlinks be- 
tween Web pages; we refer to this as the Web graph. Nodes represent static 
html pages and hyperlinks represent directed edges between them. Recent esti- 
mates [5] suggest that there are several hundred million nodes in the Web graph; 
this quantity is growing by several percent each month. The average node has 
roughly seven hyperlinks (directed edges) to other pages, making for a total of 
several billion hyperlinks in all. 

There are several reasons for studying the Web graph. The structure of this 
graph has already led to improved Web search [7,10,11,12,25,34], more accurate 
topic-classification algorithms [13] and has inspired algorithms for enumerating 
emergent cyber-communities [28] . Beyond the intrinsic interest of the structure 
of the Web graph, measurements of the graph and of the behavior of users 
as they traverse the graph, are of growing commercial interest. These in turn 
raise a number of intriguing problems in graph theory and the segmentation 
of Markov chains. For instance, Charikar et al. [14] suggest that analysis of 
surfing patterns in the Web graph could be exploited for targeted advertising 
and recommendations. Fagin et al. [18] consider the limiting distributions of 
Markov chains (modeling users browsing the Web) that occasionally undo their 
last step. 

In this lecture we will cover the following themes from our recent work: 

— How the structure of the Web graph has been exploited to improve the 
quality of Web search. 

— How the Web harbors an unusually large number of certain clique-like sub- 
graphs, and the efficient enumeration of these subgraphs for the purpose of 
discovering communities of interest groups in the Web. 

— A number of measurements of degree sequences, connectivity, component 
sizes and diameter on the Web. The salient observations include: 

1. In-degrees on the Web follow an inverse power-law distribution. 

2. About one quarter of the nodes of the Web graph lie in a giant strongly 
connected component; the remaining nodes lie in components that give 
some insights into the evolution of the Web graph. 

3. The Web is not well-modeled by traditional random graph models such 
as $G_{n,p}$. 

— A new class of random graph models for evolving graphs. In particular, some 
characteristics observed in the Web graph are modeled by random graphs in 
which the destinations of some edges are created by probabilistically copying 
from other edges at random. This raises the prospect of the study of a new 
class of random graphs, one that also arises in other contexts such as the 
graph of telephone calls [3]. 
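
The copying idea in the last theme can be illustrated with a small simulation. This is a minimal sketch in the spirit of such models, not the model analyzed in the cited work: the function names, the parameters `out_degree` and `copy_prob`, and the one-node seed graph are all assumptions made for illustration.

```python
import random
from collections import Counter

def copying_model(n_nodes, out_degree=3, copy_prob=0.5, seed=0):
    """Grow a directed graph; links[v] lists the targets of v's out-links."""
    rng = random.Random(seed)
    links = [[0] * out_degree]  # assumed seed graph: node 0 with self-loops
    for v in range(1, n_nodes):
        # Pick a random earlier node as the prototype to copy from.
        prototype = links[rng.randrange(v)]
        links.append([
            # Copy the prototype's j-th link, or link to a uniform earlier node.
            prototype[j] if rng.random() < copy_prob else rng.randrange(v)
            for j in range(out_degree)
        ])
    return links

def in_degrees(links):
    """Count how many links point at each node."""
    return Counter(t for targets in links for t in targets)

links = copying_model(2000)
deg = in_degrees(links)
```

Sorting `deg.values()` and plotting them on log-log axes is one way to compare the resulting in-degree distribution with a power law.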

Pointers to these algorithms and observations, as well as related work, may 
be found in the bibliography below. 
Abstract. The clique-width of a graph is an invariant which measures 
the complexity of the graph structure. A graph of bounded tree-width 
is also of bounded clique-width (but not conversely). For graphs G of 
bounded clique-width, given the bounded-width decomposition of G, every 
optimization, enumeration or evaluation problem that can be defined 
by a Monadic Second Order Logic formula using quantifiers on vertices 
but not on edges can be solved in polynomial time. 

This is reminiscent of the situation for graphs of bounded tree-width, 
where the same statement holds even if quantifiers are also allowed on 
edges. Thus, graphs of bounded clique-width are a larger class than 
graphs of bounded tree-width, on which we can resolve fewer, but still 
many, optimization problems efficiently. 

In this paper we present the first polynomial time algorithm ($O(n^2 m)$) 
to recognize graphs of clique-width at most 3. 



1 Introduction 

The notion of the clique-width of graphs was first introduced by Courcelle, 
Engelfriet and Rozenberg in [CER93]. The clique-width of a graph G, denoted 
by cwd(G), is defined as the minimum number of labels needed to construct G, 
using the four graph operations: creation of a new vertex v with label i 
(denoted i(v)), disjoint union ($\oplus$), connecting vertices with specified 
labels ($\eta$) and renaming labels ($\rho$). Note that $\oplus$ is the disjoint 
union of two labeled graphs, in which each vertex of the new graph retains the 
label it had previously; $\eta_{i,j}$ ($i \neq j$), called the “join” operation, 
causes all edges (that are not already present) to be created between every 
vertex of label i and every vertex of label j; and $\rho_{i \to j}$ causes all 
vertices of label i to assume label j. As an example of these notions see the 
graph in Fig. 2 together with its 3-expression in Fig. 3 and the parse tree 
associated with the expression in Fig. 4. A detailed study of clique-width is 
presented in [CO99]. Also a study of the clique-width of graphs with few $P_4$s 
(a $P_4$ is a path on four vertices) and on perfect graph classes is presented 
in [MR99,GR99]. For example, distance hereditary graphs and $P_4$-sparse graphs 
have bounded clique-width (3 and 4 respectively) whereas unit interval graphs, 
split graphs and permutation graphs all have unbounded clique-width. 
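
The four operations can be sketched as a tiny evaluator. This is an illustrative sketch only: the function names and the `(labels, edges)` representation are assumptions, and the expression evaluated at the end is one reading of the scanned 3-expression of Fig. 3, so the resulting edge list is likewise an assumption about the example graph.

```python
from itertools import product

def vertex(label, name):
    """i(v): a single vertex `name` carrying `label`."""
    return ({name: label}, set())

def union(g, h):
    """Disjoint union: every vertex keeps its label."""
    (lg, eg), (lh, eh) = g, h
    assert not set(lg) & set(lh), "vertex names must be distinct"
    return ({**lg, **lh}, eg | eh)

def join(i, j, g):
    """eta_{i,j}: add all edges between label i and label j (i != j)."""
    labels, edges = g
    vi = [v for v, l in labels.items() if l == i]
    vj = [v for v, l in labels.items() if l == j]
    return (labels, edges | {frozenset((a, b)) for a, b in product(vi, vj)})

def relabel(i, j, g):
    """rho_{i->j}: every vertex of label i takes label j."""
    labels, edges = g
    return ({v: (j if l == i else l) for v, l in labels.items()}, edges)

# Assumed reading of the 3-expression t = eta_{1,2}(l + r) of Fig. 3.
l = relabel(1, 3, join(1, 3, union(
    join(1, 2, union(vertex(1, 8), vertex(2, 7))),
    join(2, 3, union(vertex(3, 1), vertex(2, 2))))))
r = relabel(2, 3, join(2, 3, union(
    join(1, 2, union(vertex(1, 6), vertex(2, 5))),
    join(1, 3, union(vertex(1, 3), vertex(3, 4))))))
labels, edges = join(1, 2, union(l, r))
```

Under this reading the expression uses only labels 1, 2 and 3 and yields a graph on vertices 1 to 8 with ten edges.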

The motivation for studying clique-width is analogous to that of tree-width. 
In particular, given a parse tree which shows how to construct a graph G using 
k labels and the operations above, every decision, optimization, enumeration or 
evaluation problem on G which can be defined by a Monadic Second Order Logic 
formula $\varphi$, using quantifiers on vertices but not on edges, can be solved in time 
$c_k \cdot O(n + m)$ where $c_k$ is a constant which depends only on $\varphi$ and k, and where n 
and m denote the number of vertices and edges of the input graph, respectively. 
For details, see [CMRa,CMRb]. 

Furthermore clique-width is “more powerful” than tree-width in the sense 
that if a class of graphs is of bounded tree-width then it is also of bounded 
clique-width [CO99]. (In particular for every graph G, $cwd(G) \le 2^{twd(G)+1} + 1$, 
where twd(G) denotes the tree-width of G.) 

One of the central open questions concerning clique-width is determining the 
complexity of recognizing graphs of clique-width at most k, for fixed k. It is 
easy to see that graphs of clique-width 1 are graphs with no edges. The graphs 
of clique-width at most 2 are precisely the cographs (i.e., graphs without an 
induced $P_4$) [CO99]. In this paper we present a polynomial time algorithm to 
determine if a graph has clique-width at most 3. For graphs of clique-width $\le 3$ 
the algorithm also constructs the 3-expression which defines the graph. 
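
The cograph characterization gives a direct, if slow, test for clique-width at most 2. The sketch below is a brute-force check for an induced $P_4$, written only to illustrate the definition; the function names and the adjacency-dictionary representation are assumptions, and linear-time cograph recognition algorithms exist that this does not attempt to reproduce.

```python
from itertools import permutations

def has_induced_p4(adj):
    """True if some four vertices a-b-c-d induce a path (no chords)."""
    for a, b, c, d in permutations(adj, 4):
        if (b in adj[a] and c in adj[b] and d in adj[c]
                and c not in adj[a] and d not in adj[a] and d not in adj[b]):
            return True
    return False

def is_cograph(adj):
    """Cographs are exactly the graphs with no induced P4."""
    return not has_induced_p4(adj)

p4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
c4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
c5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
```

Here `is_cograph(c4)` holds, while `p4` and `c5` both contain an induced $P_4$.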

An implementation that achieves the $O(n^2 m)$ bound would be quite complicated, 
because a linear modular decomposition algorithm is needed. However the 
other steps of the algorithm present no major difficulty: we use only standard 
data structures, and the Ma-Spinrad split decomposition algorithm. So if we fall 
back on an easy modular decomposition algorithm (see Sec. 6), there is a slightly 
slower ($O(n^2 m \log n)$), easily implementable version of the algorithm. 

Unfortunately, there does not seem to be a succinct forbidden subgraph 
characterization of graphs with clique-width at most 3, similar to the $P_4$-free 
characterization of graphs with clique-width at most 2. In fact every cycle $C_n$ 
with $n \ge 7$ has clique-width 4, thereby showing an infinite set of minimal 
forbidden induced subgraphs for clique-width $\le 3$ [MR99]. 

2 Background 

We first introduce some notation and terminology. The graphs we consider in 
this paper are undirected and loop-free. For a graph G we denote by V(G) 
(resp. E(G)) the set of vertices (resp. edges) of G. For $X \subseteq V(G)$, we denote 
by G[X] the subgraph of G induced by X. We denote by $G \setminus X$ the subgraph 
of G induced by $V(G) \setminus X$. We say that vertex v is universal to X if v is adjacent 
to all vertices in $X \setminus \{v\}$ and that v is universal in G if v is universal to V(G). 
On the other hand v misses X if v misses (i.e., is not adjacent to) all vertices 
in $X \setminus \{v\}$. We denote by N(v) the neighborhood of v in G, i.e., the set of 
vertices in G adjacent to v. We denote by N[v] the closed neighborhood of v, 
i.e., $N[v] = N(v) \cup \{v\}$. 

A labeled graph is a graph with integer labels associated with its vertices, 
such that each vertex has exactly one label. We denote by $(G : V_1, \ldots, V_p)$ the 
labeled graph G with labels in $\{1, \ldots, p\}$ where $V_i$ denotes the set of vertices 
of G having label i (some of these sets may be empty). 

The definition of the clique-width of graphs (see the introduction) extends 
naturally to labeled graphs. The clique-width of a labeled graph $(G : V_1, \ldots, V_p)$, 
denoted by $cwd(G : V_1, \ldots, V_p)$, is the minimum number of labels needed to 
construct G such that all vertices of $V_i$ have label i (at the end of the construction 
process), using the four operations $i(v)$, $\eta$, $\rho$ and $\oplus$ (see the introduction for the 
definition of these operations). Note that, for instance, the cycle with 4 vertices 
(the $C_4$) is of clique-width $\le 3$, but the $C_4$ labeled 1-1-2-2 consecutively 
around the circle is not. 

We say that a graph is 2-labeled if exactly two of the label sets are non-empty, 
and 3-labeled if all three of them are non-empty. 

Without loss of generality, we may assume that our given graphs are prime, 
in the sense that they have no modules. (A module of G is an induced subgraph 
H, $1 < |H| < |G|$, such that each vertex in $G \setminus H$ is either universal to H 
or misses H.) This assumption follows from the easily verifiable observation 
(see [CMRa]) that for every graph G which is not a cograph (i.e., is of 
clique-width > 2), and has a module H, $cwd(G) = \max\{cwd(H), cwd(G \setminus (H - x))\}$, 
where x is any vertex of H. 
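
The module condition is easy to test directly. The following is a brute-force sketch for illustration (exponential in the number of vertices, unlike the modular decomposition algorithms the paper relies on); the function names and adjacency-dictionary representation are assumptions.

```python
from itertools import combinations

def is_module(adj, h):
    """True if h is a module: 1 < |h| < |V| and every outside vertex
    is adjacent to all of h or to none of h."""
    h = set(h)
    if not 1 < len(h) < len(adj):
        return False
    return all(len(adj[v] & h) in (0, len(h)) for v in adj if v not in h)

def modules(adj):
    """All modules, by exhaustive search over vertex subsets."""
    return [set(s) for k in range(2, len(adj))
            for s in combinations(sorted(adj), k) if is_module(adj, s)]

def is_prime(adj):
    return not modules(adj)

p4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
c4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
```

The path `p4` is prime, whereas in the cycle `c4` the two opposite vertices 0 and 2 form a module.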

Given a connected 3-labeled graph, the last operation in a parse tree which 
constructs it must have been a join. In principle this yields three possibilities. 
However, if two different join operations are possible, the graph has a module: 
for example, if both $\eta_{1,2}$ and $\eta_{1,3}$ are possible, the vertices of label 2 and 3 form 
a module. So since we are only considering prime graphs we can determine the 
last operation of the parse tree. 

Unfortunately we cannot continue this process on the subproblems as deleting 
the join edges may disconnect the graph and leave us with 2-labeled subproblems. 
In fact it turns out that solving Clique-width restricted to 3-labeled graphs is 
no easier than solving it for 2-labeled graphs. 

In contrast, if we attempt to find in this top-down way the parse tree for a 
2-labeled prime graph, then we never produce a subproblem with only 1 non-empty 
label set, because its vertices would form a module (as the reader may verify). 
This fact motivates the following definition: for a partition $A \cup B$ of V(H), let 
2-LAB(H : A, B) denote the problem of determining whether $cwd(H : A, B) \le 3$ 
(and finding a corresponding decomposition tree). Since A and B form a disjoint 
partition of V(H) we will also denote it as 2-LAB(H : A, -). If we can find a 
polynomial time algorithm to solve this problem, then our problem reduces to 
finding a small set S of possible 2-labelings such that at least one of them is 
of clique-width $\le 3$ iff G is of clique-width $\le 3$. We first discuss how we solve 
2-LAB, then discuss how to use it to solve the general clique-width $\le 3$ problem. 
3 Labeled Graphs 

The 2-LAB problem is easier to solve than the general clique-width $\le 3$ problem, 
because there are fewer possibilities. The last operation must have been a 
relabeling, and before that edges were created. With our top-down approach, we 
have to introduce a third label set, that is, to split one of the two sets A or B in 
such a way that all edges are present between two of the three new sets (Fig. 1 
shows the two possibilities when the set of vertices labeled with 1 is split); and 
there are four ways to do this in general. Each of these four ways corresponds to 
one of the ways of introducing the third label set, namely: consider the vertices 
of A that are universal to B; consider the vertices of B that are universal to A; 
consider the co-connected components of both A and B (these are the connected 
components of the complement of the set). 




Fig. 1. 2-LAB procedure main idea 



If there is only one possible way of relabeling the vertices and undoing a join, 
then we have a unique way of splitting our problem into smaller problems; we 
do so and continue, restricting our attention to these simpler subproblems. 

The difficulty arises when there is more than one possible join that could 
be undone. As mentioned in the previous section, if all three label sets are non- 
empty, then this possibility will not arise because it would imply the existence of 
a module. In the general case this situation may arise, but again it implies very 
strong structural properties of the graph, which allow us to restrict our attention 
to just one of the final possible joins. The proof that we need consider just one 
final join even when there is more than one possibility will be described in the 
journal version of the paper (or see [WWW]). 

We then remove the edges (adding a join node to the decomposition tree), 
which disconnects the graph, and we can apply again the above algorithm, until 
either we have a full decomposition tree, or we know that the input graph with 
the initial labeling of A and B is not of Clique-width <3. 



4 Algorithm Outline 

We now know how to determine if the clique-width of (G : A, B) is $\le 3$ for any 
partition $A \cup B$ of V(G). Our problem thus reduces to finding a small set S of 
possible 2-labelings such that at least one of them is of clique-width $\le 3$ iff G is 
of clique-width $\le 3$. 
We are interested in the last join operation in a parse tree corresponding 
to a 3-expression which defines G (if such an expression exists); without loss 
of generality we can assume that this is an $\eta_{1,2}$. The first case is when there 
is only one vertex x of label 1. In this case the parse tree is a solution of 
2-LAB(G : {x}, -). More generally, if the graph obtained by deleting the edges 
from the last join has a 3-labeled connected component, then it turns out that 
there is a simple procedure for finding the corresponding 2-LAB problems. 

Do all graphs with clique-width at most 3 have a parse tree which ends in 
this way? Unfortunately the answer is no. The graph in Fig. 2 has clique-width 3 
but it is easy to show that there is no parse tree formed in the manner described 
above. 



Fig. 2. An example graph. 

$t = \eta_{1,2}(l \oplus r)$ 
$l = \rho_{1 \to 3} \circ \eta_{1,3}\big(\eta_{1,2}(1(8) \oplus 2(7)) \oplus \eta_{2,3}(3(1) \oplus 2(2))\big)$ 
$r = \rho_{2 \to 3} \circ \eta_{2,3}\big(\eta_{1,2}(1(6) \oplus 2(5)) \oplus \eta_{1,3}(1(3) \oplus 3(4))\big)$ 

Fig. 3. A 3-expression t for the graph of Fig. 2. 



Thus, we need to consider other final joins, when the graph obtained by 
deleting the edges from the last join has no 3-labeled connected component. 
This leads to the notion of cuts (often called joins in the literature) first studied 
by Cunningham [Cun82]. A cut is a disjoint partition of V(G) into (X : Y), 
where $|X|, |Y| > 1$, together with the identification of subsets $\tilde{X} \subseteq X$, $\tilde{Y} \subseteq Y$, 
called the boundary sets, where $E(G) \cap (X \times Y) = \tilde{X} \times \tilde{Y}$. Note that since we 
assume our graphs are module free, $\tilde{X} \subset X$ and $\tilde{Y} \subset Y$. For the graph in Fig. 2, 
X = {1,2,7,8}, $\tilde{X}$ = {2,7}, Y = {3,4,5,6} and $\tilde{Y}$ = {3,6}. 
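
The cut condition can be checked mechanically. The sketch below is illustrative only: `boundary_sets` is a hypothetical helper, and the edge list is an assumption obtained by reading the scanned 3-expression of Fig. 3, not taken from the figure itself.

```python
def boundary_sets(adj, x, y):
    """Return the boundary sets of the cut (X : Y), or None if the
    partition is not a cut."""
    x, y = set(x), set(y)
    if len(x) <= 1 or len(y) <= 1 or x & y or (x | y) != set(adj):
        return None
    bx = {u for u in x if adj[u] & y}  # vertices of X with a neighbor in Y
    by = {v for v in y if adj[v] & x}  # vertices of Y with a neighbor in X
    # The crossing edges must be exactly all pairs between bx and by.
    if all(by <= adj[u] for u in bx):
        return bx, by
    return None

# Assumed edge list for the example graph (derived from the 3-expression).
EDGES = [(1, 2), (1, 8), (7, 8), (3, 4), (4, 5),
         (5, 6), (2, 3), (2, 6), (3, 7), (6, 7)]
adj = {v: set() for v in range(1, 9)}
for a, b in EDGES:
    adj[a].add(b)
    adj[b].add(a)

cut = boundary_sets(adj, {1, 2, 7, 8}, {3, 4, 5, 6})
```

Under the assumed edge list, `cut` agrees with the boundary sets {2,7} and {3,6} stated in the text, while a partition such as ({1,2} : {3,...,8}) is rejected.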
[Figure: parse tree with root $\eta_{1,2}$.] 




Note how the partition of V(G) by a cut (X : Y) is reflected in a parse 
tree of G, where $G_1 = G[X]$ and $G_2 = G[Y]$. This suggests the possibility of 
an algorithm to examine every cut of G and to try to find a parse tree that 
reflects that cut. There are a number of problems with this approach. First, the 
number of cuts may grow exponentially with n. (In particular, consider the graph 
consisting of $(n-1)/2$ $P_3$s that all share a common endpoint.) Fortunately, as we 
will see later, we need only consider O(n) cuts. Secondly, we would need a 
polynomial time algorithm to find such a set of O(n) cuts. In fact the algorithm 
by Ma and Spinrad [MS94] does this for us. (Another approach for finding a 
polynomial size set of cuts is described in [WWW].) For any of these cuts (say 
cut (X : Y)) we can see if it corresponds to an appropriate decomposition by 
solving 2-LAB(G : $\tilde{X} \cup \tilde{Y}$, -), where $\tilde{X}$ and $\tilde{Y}$ are the boundary sets of the cut. 

We now present the formal specifications of our algorithm. 



5 Formal Description 



Our algorithm has the following outline: (Note that although the algorithm is 
described as a recognition algorithm, it can easily be modified to produce a 
3-expression which defines the input graph, if such an expression exists.) 

Given graph J, use Modular Decomposition to find the prime graphs 
$J_1, \ldots, J_k$ associated with J. 

for i := 1 to k 
    if not CWD3($J_i$) then 
        STOP (cwd(J) > 3) 

STOP (cwd(J) $\le$ 3) 
function CWD3(G) 

{This function is true iff prime graph G has cwd(G) $\le$ 3.} 

{First see if there is a parse tree with a final join with a 3-labeled connected 
component.} 

for each $x \in V(G)$ 
    if 2-LAB(G : {x}, -) or 2-LAB(G : $\{z \in N(x) \mid N[z] \subseteq N[x]\}$, -) then 
        return true 

{Since there is no such parse tree, determine if there is a parse tree for which 
the final join corresponds to a cut.} 

Produce a set of cuts $\{(X_1 : Y_1), (X_2 : Y_2), \ldots, (X_l : Y_l)\}$ so that if there is 
a parse tree whose final join corresponds to a cut, there is one whose final join 
corresponds to a cut in this set (using e.g. Ma-Spinrad). 

for i := 1 to l 
    if 2-LAB(G : $\tilde{X}_i \cup \tilde{Y}_i$, -) then 
        return true 

return false 



6 Correctness and Complexity Issues 

In this section we give more details regarding the sufficiency, for our purposes, of 
the set of cuts determined by the Ma-Spinrad algorithm and then briefly discuss 
the complexity of our algorithm. 

As mentioned in Sect. 4, the number of cuts in a graph may grow exponentially 
with the size of the graph. We prove however, that if none of the cuts 
identified by the Ma-Spinrad algorithm show that $cwd(G) \le 3$ then no cut can 
establish $cwd(G) \le 3$. In order to prove this we first introduce some notation. 

We say that cut (X : Y) is connected if both G[X] and G[Y] are connected, 
1-disconnected if exactly one of G[X] and G[Y] is disconnected and 2-disconnected 
if both G[X] and G[Y] are disconnected. We say that two cuts (X : Y) and 
(W : Z) cross iff $X \cap W \neq \emptyset$, $X \cap Z \neq \emptyset$, $Y \cap W \neq \emptyset$ and $Y \cap Z \neq \emptyset$. We denote 
by $C_T$ the set of cuts produced by the Ma-Spinrad algorithm. Recall that for 
every cut (X : Y) with boundary sets $\tilde{X}$ and $\tilde{Y}$ our algorithm calls 
2-LAB(G : $\tilde{X} \cup \tilde{Y}$, -) to check whether this cut can establish $cwd(G) \le 3$. 
We refer to this as the call to 2-LAB on behalf 
of cut (X : Y). Suppose all the calls to 2-LAB on behalf of the cuts in $C_T$ failed 
and there is a cut (X : Y) not in $C_T$ such that the call to 2-LAB on behalf 
of (X : Y) succeeds. We show that there is a cut (W : Z) in $C_T$ which crosses 
(X : Y). Furthermore we show that if (X : Y) is connected then $\tilde{X} \cup \tilde{Y} = \tilde{W} \cup \tilde{Z}$. 
Thus the call to 2-LAB on behalf of cut (X : Y) is the same as the call to 2-LAB 
on behalf of cut (W : Z), a contradiction. Thus (X : Y) is not connected. We 
show that if (X : Y) is 2-disconnected then (X : Y) must be in $C_T$, again a 
contradiction. Thus (X : Y) must be 1-disconnected. In this case we also reach a 
contradiction, as described in the full version of the paper. 

We now turn briefly to complexity issues. As shown in [CH94] modular 
decomposition can be performed in linear time. The Ma-Spinrad algorithm can be 
implemented in $O(n^2)$ time. Function 2-LAB is invoked O(n) times. As shown 
in the journal version of the paper, the complexity of 2-LAB is O(mn); thus the 
overall complexity of our algorithm is $O(n^2 m)$. 

There is one case in the 2-LAB procedure where we use a modular decomposition 
tree. Thus for achieving best complexity, a linear modular decomposition 
algorithm is needed there. Up to now, no such algorithm is known that 
is also easy to implement. However, if a practical algorithm is sought, one can 
use an $O(n + m \log n)$ algorithm [HPV99]. The complexity of 2-LAB is then 
$O(mn \log n)$, and the overall complexity would be $O(mn^2 \log n)$. 
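
The two bounds can be checked by adding up the stated costs of the steps. This is only a bookkeeping sketch; it assumes $m \ge 1$ so that the $O(n^2)$ term is absorbed.

```latex
\begin{align*}
T_{\text{linear}}(n,m)
  &= \underbrace{O(n+m)}_{\text{modular decomposition}}
   + \underbrace{O(n^2)}_{\text{Ma-Spinrad}}
   + \underbrace{O(n)\cdot O(mn)}_{\text{calls to 2-LAB}}
   = O(n^2 m), \\
T_{\text{practical}}(n,m)
  &= O(n + m\log n) + O(n^2) + O(n)\cdot O(mn\log n)
   = O(mn^2\log n).
\end{align*}
```

In both cases the calls to 2-LAB dominate the running time.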

7 Concluding Remarks 

Having shown that the clique-width at most 3 problem is in P, the key open 
problem is to determine whether the fixed clique-width problem is in P for 
constants larger than 3. Even extending our algorithm to the 4 case is a nontrivial 
and open problem. Although, to the best of our knowledge, it has not been 
established yet, one fully expects the general clique-width decision problem to 
be NP-complete. 
Abstract. The dart is the five-vertex graph with degrees 4, 3, 2, 2, 1. 
An even pair is a pair of vertices such that every chordless path between 
them has even length. A graph is perfectly contractile if every induced 
subgraph has a sequence of even-pair contractions that leads to a clique. 
We show that a recent conjecture on the forbidden structures for per- 
fectly contractile graphs is satisfied in the case of dart-free graphs. Our 
proof yields a polynomial-time algorithm to recognize dart-free perfectly 
contractile graphs. 

Keywords: Perfect graphs, even pairs, dart-free graphs, claw-free graphs 



1 Introduction 

A graph G is perfect [1] if every induced subgraph H of G has its chromatic 
number $\chi(H)$ equal to the maximum size $\omega(H)$ of the cliques of H. One of 
the most attractive properties of perfect graphs is that some problems that are 
hard in general, such as optimal vertex-coloring and maximum clique number, 
can be solved in polynomial time in perfect graphs, thanks to the algorithm 
of Grotschel, Lovasz and Schrijver [7]. However, that algorithm, based on the 
ellipsoid method, is quite impractical. So, an interesting open problem is to find 
a combinatorially “simple” polynomial-time algorithm to color perfect graphs. 
In such an algorithm, one may reasonably expect that some special structures 
of perfect graphs will play an important role. An even pair in a graph G is 
a pair of non-adjacent vertices such that every chordless path of G between 
them has an even number of edges. The contraction of a pair of vertices x, y 
in a graph G is the process of removing x and y and introducing a new vertex 
adjacent to every neighbor of x or y in G. Fonlupt and Uhry [6] proved that 
contracting an even pair in a perfect graph yields a new perfect graph with the 
same maximum clique number. In consequence, a natural idea for coloring a 
perfect graph G is, whenever it is possible, to find an even pair in G, to contract 
it, and to repeat this procedure until a graph G' that is easy to color is obtained. 
By the result of Fonlupt and Uhry, that final graph G' has the same maximum 
clique size as G and (since it is perfect) the same chromatic number. Each 

* This research was partially supported by the cooperation between CAPES (Brazil) 
and COFECUB (France), project number 213/97. The first author is partially sup- 
ported by CNPq-Brazil grant number 301330/97. 
vertex of G' represents a stable set of G, so one can easily obtain an optimal 
coloring of G from any optimal coloring of G'. For many classical perfect graphs 
one may expect the final graph to be a clique. Thus one may wonder whether 
every perfect graph admits a sequence of even-pair contractions that leads to 
a clique. Unfortunately, the answer to this question is negative (the smallest 
counterexample is the complement of a 6-cycle). 
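
The definitions of even pair and contraction can be tried out on small graphs. The sketch below enumerates chordless paths exhaustively, so it is exponential and meant only as an illustration; the function names and the adjacency-dictionary representation are assumptions.

```python
from itertools import combinations

def chordless_path_lengths(adj, s, t):
    """Set of lengths (in edges) of all chordless paths from s to t."""
    lengths = set()

    def extend(path):
        for w in adj[path[-1]]:
            if w in path:
                continue
            # w may touch only the last vertex of the path, else a chord.
            if any(w in adj[p] for p in path[:-1]):
                continue
            if w == t:
                lengths.add(len(path))
            else:
                extend(path + [w])

    extend([s])
    return lengths

def is_even_pair(adj, x, y):
    """Non-adjacent x, y whose chordless paths all have even length."""
    return y not in adj[x] and all(
        k % 2 == 0 for k in chordless_path_lengths(adj, x, y))

def even_pairs(adj):
    return [p for p in combinations(sorted(adj), 2) if is_even_pair(adj, *p)]

def contract(adj, x, y, new):
    """Replace x and y by one vertex `new` adjacent to N(x) | N(y)."""
    nbrs = (adj[x] | adj[y]) - {x, y}
    out = {v: (ns - {x, y}) | ({new} if v in nbrs else set())
           for v, ns in adj.items() if v not in (x, y)}
    out[new] = set(nbrs)
    return out

# Complement of the 6-cycle 0-1-2-3-4-5-0.
co_c6 = {i: {j for j in range(6) if j != i and (i - j) % 6 not in (1, 5)}
         for i in range(6)}
p3 = {0: {1}, 1: {0, 2}, 2: {1}}
```

The complement of the 6-cycle has no even pair at all, so no contraction can even start, while contracting the endpoints of the path `p3` yields a single edge.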

Bertschi [2] proposes to call a graph G even contractile if it admits a sequence 
of even-pair contractions leading to a clique, and perfectly contractile if every 
induced subgraph of G is even contractile. The class of perfectly contractile 
graphs contains many known classes of perfect graphs, such as Meyniel graphs, 
weakly triangulated graphs, and perfectly orderable graphs, see [4]. 

Everett and Reed [5] have proposed a conjecture characterizing perfectly 
contractile graphs. In order to present it, we need some technical definitions. A 
hole is a chordless cycle of length at least five, and an antihole is the complement 
of a hole. A hole or antihole is even (resp. odd) if it has an even (odd) number of 
vertices. We denote by $\overline{C_6}$ the complement of a hole on six vertices. 

Definition 1 (Stretcher). A stretcher is any graph that can be obtained by 
subdividing the three edges of $\overline{C_6}$ that do not lie in a triangle in such a way 
that the three chordless paths between the two triangles have the same parity. A 
stretcher is odd (resp. even) if the three paths are odd (even) (see figure 1). 





Figure 1. An even stretcher and an odd stretcher. 



Conjecture 1 (Perfectly Contractile Graph Conjecture [5]). A graph is perfectly 
contractile if and only if it contains no odd hole, no antihole, and no odd 
stretcher. 

Note that there is no even pair in an odd hole or in an antihole, but odd 
stretchers may have even pairs. So, the ‘only if’ part of the conjecture is estab- 
lished if we can check that every sequence of even-pair contractions in an odd 
stretcher leads to a graph that is not a clique; this is less obvious but was done 
formally in [9]. 

The above conjecture has already been proved for planar graphs [9], for 
claw-free graphs [8] and for bull-free graphs [3]. 

Here, we are interested in the dart-free perfectly contractile graphs. Recall 
that the dart is the graph on five vertices with degree sequence (4, 3, 2, 2, 1); in 
other words, a dart is obtained from a 4-clique by removing one edge and adding 
a new vertex adjacent to exactly one of the remaining vertices of degree three. 
We will call tips of the dart its two vertices of degree two. 
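
A brute-force search for an induced dart follows directly from this description. The sketch below relies on the observation that the degree sequence (4, 3, 2, 2, 1) determines the dart among five-vertex graphs: the vertex of degree 4 is adjacent to everything, and the remaining degrees (2, 1, 1, 0) force a path on three vertices plus an isolated vertex. The function name and representation are assumptions, and the search takes time of order $n^5$.

```python
from itertools import combinations

DART_DEGREES = [1, 2, 2, 3, 4]

def find_dart(adj):
    """Return the vertex set and tips of an induced dart, or None."""
    for sub in combinations(adj, 5):
        s = set(sub)
        # Degrees inside the induced subgraph on these five vertices.
        inner = {v: len(adj[v] & s) for v in sub}
        if sorted(inner.values()) == DART_DEGREES:
            tips = {v for v, d in inner.items() if d == 2}
            return {"vertices": s, "tips": tips}
    return None

# K4 on a, b, c, d minus the edge cd, plus e adjacent to a.
dart = {"a": {"b", "c", "d", "e"}, "b": {"a", "c", "d"},
        "c": {"a", "b"}, "d": {"a", "b"}, "e": {"a"}}
claw = {"x": {"p", "q", "r"}, "p": {"x"}, "q": {"x"}, "r": {"x"}}
```

On `dart` the search reports the tips c and d; a graph is dart-free exactly when `find_dart` returns `None`, as it does for `claw`.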




A graph is dart-free if it does not contain a dart as an induced subgraph. 
Dart-free graphs form a large class of interest in the realm of perfect graphs as 
it contains all diamond-free graphs and all claw-free graphs. Dart-free graphs 
were introduced by Chvatal, and Sun [11] proved that the Strong Perfect Graph 
Conjecture is true for this class, that is, a dart-free graph is perfect if and only if it 
contains no odd hole and no odd antihole. Chvatal, Fonlupt, Sun and Zemirline 
[12] devised a polynomial-time algorithm to recognize dart-free perfect graphs. On the 
other hand, the problem of coloring the vertices of a dart-free perfect graph in 
polynomial time using only simple combinatorial arguments remains open. 

We will prove that Everett and Reed’s conjecture on perfectly contractile 
graphs is also true for dart-free graphs, that is: 

Theorem 1 (Main Theorem). A dart-free graph is perfectly contractile if and 
only if it contains no odd hole, no antihole, and no odd stretcher. 

Moreover, we will present a polynomial-time combinatorial algorithm to color 
optimally the perfectly contractile dart-free graphs. In order to prove our main 
theorem, we will use the decomposition structure found by Chvatal, Fonlupt, 
Sun and Zemirline [12]. It is presented in the next section. 

We finish this section with some terminology and notation. We denote by 
N(x) the subset of vertices of G to which x is adjacent. The complement of a 
graph G is denoted by $\overline{G}$. If {x, y} is an even pair of a graph G, the graph obtained 
by the contraction of x and y is denoted by G/xy. It will be convenient here to 
call two vertices x, y of a graph twins when they are adjacent and 
$N(x) \cup \{x\} = N(y) \cup \{y\}$ (the usual definition of twins does not necessarily 
require them to be adjacent). A claw is a graph isomorphic to the complete 
bipartite graph $K_{1,3}$. A double-claw is a graph with vertices 
$u_1, u_2, u_3, v_1, v_2$ and edges $v_1 v_2$ and $u_i v_j$ 
($1 \le i \le 3$, $1 \le j \le 2$). Two twins x, y are called double-claw twins if they 
are the vertices $v_1, v_2$ in a double-claw as above. The join of two vertex-disjoint 
graphs $G_1 = (V_1, E_1)$ and $G_2 = (V_2, E_2)$ is the graph G with vertex-set $V_1 \cup V_2$ 
and edge-set $E_1 \cup E_2 \cup F$, where F is the set of all pairs made of one vertex of $G_1$ 
and one vertex of $G_2$. 

[Figure: a double-claw.] 

138 C. Linhares Sales, F. Maffray 



2 Decomposition of Dart-Free Perfect Graphs 

We present here the main results from [12] and adopt the same terminology. We 
call DART-FREE the algorithm from [12] to recognize dart-free perfect graphs. 

When a graph G has a pair of twins x, y, Lovasz’s famous Replication Lemma 
[10] ensures that G is perfect if and only if G - x (or G - y) is perfect. So, the 
initial step of algorithm DART-FREE is to remove one vertex among every pair of 
twins in the graph. Dart-free graphs without twins have some special properties. 

Definition 2 (Friendly graph [12]). A graph G is friendly if the neighborhood 
N{x) of every vertex x of G that is the center of a claw induces vertex-disjoint 
cliques. 
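
Definition 2 translates into a short brute-force test. The sketch below is illustrative; the function names are assumptions, and it uses the fact that a graph is a disjoint union of cliques exactly when it has no induced path on three vertices.

```python
from itertools import combinations

def is_claw_center(adj, x):
    """x has three pairwise non-adjacent neighbors."""
    return any(b not in adj[a] and c not in adj[a] and c not in adj[b]
               for a, b, c in combinations(sorted(adj[x]), 3))

def is_disjoint_cliques(adj, s):
    """No induced path a-b-c inside the vertex set s."""
    return not any(b in adj[a] and c in adj[b] and c not in adj[a]
                   for a in s for b in s for c in s if a != c)

def is_friendly(adj):
    return all(is_disjoint_cliques(adj, adj[x])
               for x in adj if is_claw_center(adj, x))

claw = {"x": {"p", "q", "r"}, "p": {"x"}, "q": {"x"}, "r": {"x"}}
dart = {"a": {"b", "c", "d", "e"}, "b": {"a", "c", "d"},
        "c": {"a", "b"}, "d": {"a", "b"}, "e": {"a"}}
double_claw = {"v1": {"v2", "u1", "u2", "u3"}, "v2": {"v1", "u1", "u2", "u3"},
               "u1": {"v1", "v2"}, "u2": {"v1", "v2"}, "u3": {"v1", "v2"}}
```

The claw is friendly, whereas the dart and the double-claw are not: in each of the latter, the neighborhood of a claw center contains an induced path on three vertices.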



Theorem A ([12]) Let G be a dart-free graph without twins. If G and $\overline{G}$ are 
connected, then G is friendly. 



Theorem B ([12]) A graph G is friendly if and only if it contains no dart and 
no pair of double-claw twins. 

Let G be a dart-free graph. Let W be the subset of all vertices of G that have 
at least one twin, and let T be a subset of W such that every pair of twins of 
G has at least one of them in T. Using Theorem A, one can find in polynomial 
time a family $\mathcal{F}$ of pairwise vertex-disjoint friendly graphs such that: (a) the 
elements of $\mathcal{F}$ are induced subgraphs of G - T, and (b) G is perfect if and only 
if every element of $\mathcal{F}$ is perfect. This family can be constructed as follows: first, 
put G - T in $\mathcal{F}$; then, as long as there exists an element H of $\mathcal{F}$ such that either 
H or $\overline{H}$ is disconnected, replace in $\mathcal{F}$ the graph H by its connected components 
(if H is disconnected) or by the complements of the connected components of $\overline{H}$ 
(if $\overline{H}$ is disconnected). In consequence, the problem of deciding whether a 
dart-free graph is perfect is reduced to deciding whether a friendly graph is perfect 
or not. For this purpose, friendly graphs are decomposed further. 
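
The construction of the family can be sketched as a simple worklist loop over vertex sets. This is an illustrative sketch with hypothetical function names; it takes the adjacency structure of G - T as input and returns vertex sets, each standing for the subgraph it induces. The pieces are friendly only under the hypotheses of Theorem A, which the code does not check.

```python
def components(vertices, adjacent):
    """Connected components of `vertices` under the relation `adjacent`."""
    vertices = set(vertices)
    comps = []
    while vertices:
        stack = [vertices.pop()]
        comp = set(stack)
        while stack:
            u = stack.pop()
            nxt = {w for w in vertices if adjacent(u, w)}
            vertices -= nxt
            comp |= nxt
            stack.extend(nxt)
        comps.append(comp)
    return comps

def split_by_components(adj):
    """Split until every piece and its complement are both connected."""
    todo, done = [set(adj)], []
    while todo:
        h = todo.pop()
        parts = components(h, lambda u, w: w in adj[u])
        if len(parts) == 1:
            # H is connected: try the components of its complement instead.
            parts = components(h, lambda u, w: w not in adj[u])
        if len(parts) == 1:
            done.append(h)
        else:
            todo.extend(parts)
    return done

p4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
c4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
```

The path `p4` is left whole, since it and its complement are connected, while the cycle `c4` is split all the way down to single vertices.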

Definition 3 (Bat [12]). A bat is any graph that can be formed by a chordless 
path $a_1 a_2 \cdots a_m$ ($m \ge 6$) and an additional vertex z adjacent to $a_1$, $a_i$, $a_{i+1}$ and 
$a_m$ for some i with $3 \le i \le m - 3$ and to no other vertex of the path. A graph G 
is bat-free if it does not contain any bat as an induced subgraph. 

Given a graph G and a vertex z, a z-edge is any edge whose endpoints are 
both adjacent to a given vertex z. The graph obtained from G by removing 
vertex z and all z-edges is denoted by G * z. 

Definition 4 (Rosette [12]). A graph G is said to have a rosette centered at 
a vertex z of G if G * z is disconnected and the neighborhood of z consists of 
vertex-disjoint cliques. 
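
The operation G * z and the rosette condition of Definition 4 can be sketched as follows. The function names are assumptions made for illustration, and the adjacency-dictionary representation is likewise a choice of this sketch.

```python
def star_remove(adj, z):
    """G * z: delete z and every z-edge (both endpoints adjacent to z)."""
    nz = adj[z]
    return {v: {w for w in ns if w != z and not (v in nz and w in nz)}
            for v, ns in adj.items() if v != z}

def is_connected(adj):
    if not adj:
        return True
    start = next(iter(adj))
    seen, stack = {start}, [start]
    while stack:
        for w in adj[stack.pop()] - seen:
            seen.add(w)
            stack.append(w)
    return len(seen) == len(adj)

def is_disjoint_cliques(adj, s):
    """No induced path a-b-c inside the vertex set s."""
    return not any(b in adj[a] and c in adj[b] and c not in adj[a]
                   for a in s for b in s for c in s if a != c)

def has_rosette_at(adj, z):
    return (not is_connected(star_remove(adj, z))
            and is_disjoint_cliques(adj, adj[z]))

# Two triangles sharing vertex 0, and a 5-cycle.
bowtie = {0: {1, 2, 3, 4}, 1: {0, 2}, 2: {0, 1}, 3: {0, 4}, 4: {0, 3}}
c5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
```

In `bowtie` the shared vertex 0 is the center of a rosette, whereas no vertex of the 5-cycle is, because removing a vertex of `c5` leaves a connected path.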



Theorem C ([12]) Every friendly graph G containing no odd hole either is 
bat-free, or has a clique-cutset, or has a rosette. 



Definition 5 (Separator [12]). A separator S is a cutset with at most two 
vertices such that, if S has two non-adjacent vertices, each component of G — S 
has at least two vertices. 



Theorem D ([12]) Every bat-free friendly graph G containing no odd hole ei- 
ther is bipartite, or is claw-free, or has a separator. 

A decomposition of G along special cutsets can be defined as follows: 

— Clique-cutset decomposition: Let C be a clique-cutset of G and let $B_1, \ldots, B_k$ 
be the connected components of G - C. The graph G is decomposed 
into the pieces of G with respect to C, which are the induced subgraphs 
$G_i = G[B_i \cup C]$ ($i = 1, \ldots, k$). 

— Rosette decomposition: Consider a rosette centered at a vertex z of G, and 
let $B_1, \ldots, B_k$ ($k \ge 2$) be the connected components of G * z. The graph 
G is decomposed into k + 1 graphs $G_1, \ldots, G_k, H$ defined as follows. For 
$i = 1, \ldots, k$, the graph $G_i$ is $G[B_i \cup \{z\}]$. The graph H is formed from 
G[N(z)] by adding vertices $w_1, \ldots, w_k$ and edges from $w_i$ to all of $N(z) \cap B_i$ 
($i = 1, \ldots, k$). 

— Separator decomposition: When S is a separator of size one or two with its 
two vertices adjacent, S is a clique-cutset and the decomposition is as above. 
When S = {u, v} is a separator of G with u, v non-adjacent, let $B_1, \ldots, B_k$ 
be the components of G - S, and let P be a chordless path between u and 
v in G. The graph G is decomposed into k graphs $G_1, \ldots, G_k$ defined as 
follows. If P is even, $G_i$ is obtained from $G[B_i \cup S]$ by adding one vertex $w_i$ 
with edges to u and v. If P is odd, set $G_i = G[B_i \cup S] + uv$. 
Algorithm Bat-free builds a decomposition tree T of a friendly graph G. At 
the initial step, G is the root and the only node of the tree. At the general step, 
let G' be any node of T. If G' can be decomposed by one of the special cutsets 
(clique or rosette), then add in T, as children of G', the graphs into which it is 
decomposed. More precisely, the clique-cutset decomposition is applied first, if 
possible; the rosette decomposition is applied only if the clique-cutset 
decomposition cannot be applied. Since each leaf H of T is friendly and has no 
clique-cutset and no rosette, Theorem C ensures that either H is bat-free or G 
was not perfect. So, the second phase of the algorithm examines the leaves of T: each leaf H 
of T must either be bipartite, or be claw-free, or contain a separator, or else G is 
not perfect, by Theorem D. If H contains a separator, a separator decomposition 
is applied. When no separator decomposition is possible, G is perfect if and only 
if all the remaining leaves of T are either bipartite or claw-free. 

3 Dart-Free Perfectly Contractile Graphs

This section is dedicated to the proof of Theorem 1. In this proof, we will use the
decomposition of dart-free perfect graphs obtained by Algorithm Bat-free.
We organize this section following the steps of the decomposition. First, we 
examine the friendly graphs. 

Theorem 2. A friendly graph G is perfectly contractile if and only if it contains 
no odd stretcher, no antihole and no odd hole. 

Proof. As observed at the beginning, no perfectly contractile graph can contain
an odd hole, an antihole, or an odd stretcher. Conversely, suppose that G has 
no odd stretcher, no antihole and no odd hole as induced subgraph, and let us 
prove by induction on the number of vertices of G that G is perfectly contractile. 
The fact is trivially true when G has at most six vertices. In the general case, 
by Theorem C, G either is bat-free or has a clique-cutset or a rosette. We are 
going to check each of these possibilities. The following lemmas prove Theorem 2 
respectively when G has a clique-cutset and when G has a rosette. Their proofs 
are omitted and will appear in the full version of the paper. 

Lemma 1. Let G be a friendly graph, with no odd hole, no odd stretcher and 
no antihole. If G has a clique cutset, then G is perfectly contractile. 

Lemma 2. Let G be a friendly graph with no odd hole, no antihole and no odd
stretcher. If G has a rosette centered at some vertex z, then: 

(i) Every piece of G with respect to z is perfectly contractile; and 

(ii) G is perfectly contractile if every piece of G has a sequence of even-pair 
contractions leading to a clique such that each graph g in this sequence either is 
dart-free or contains a dart whose tips form an even pair of g.

Lemma 3. Let G be a friendly graph with no odd hole, no antihole and no odd 
stretcher. If G is bat-free, then G is perfectly contractile. 

For this lemma, we prove that: (a) If G has a separator S, then G is perfectly
contractile; (b) If G is bipartite or claw-free, G is perfectly contractile. These 
facts imply Lemma 3. 

The following lemmas will ensure the existence of a sequence of even-pair 
contractions whose intermediate graphs are dart-free. 

Lemma 4. Every bipartite graph admits a sequence of even-pair contractions 
that lead to a clique and whose intermediate graphs are dart-free. 

Lemma 5. Every claw-free graph admits a sequence of even-pair contractions 
that lead to a clique and whose intermediate graphs either are dart-free or have 
a dart whose tips form an even pair. 

Lemma 4 is trivial; the proof of Lemma 5 is based on the study of even pairs in 
claw-free graphs that was done in [8].

Lemmas 4 and 5 imply that every friendly graph that contains no odd hole, 
no antihole and no odd stretcher admits a sequence of even-pair contractions 
whose intermediate graphs are dart-free. Therefore, Lemmas 1, 2 and 3 together 
imply Theorem 2. 

3.1 An Algorithm 

Now we give the outline of an even-pair contraction algorithm for a friendly 
graph G without odd holes, antiholes or odd stretchers. The algorithm has two 
main steps: constructing the decomposition tree, then contracting even pairs in 
a bottom-up way along the tree. 

In the first step, the algorithm uses a queue Q that initially contains only G. 
While Q is not empty, a graph G' of Q is dequeued and the following steps are
executed, while a decomposition tree T is being built:

1. If G' has a clique-cutset C, put the pieces of G' with respect to C in Q;
repeat the first step.

2. If G' has a rosette centered at z, put in Q all the pieces of G' with respect
to the rosette, except H; repeat the first step.

3. If G' has a separator {a, b} and {a, b} forms an even pair, contract a and b,
put G'/ab in Q; repeat the first step. If {a, b} forms an odd pair, put G' + ab
in Q; repeat the first step.

The second step examines the decomposition tree T in a bottom-up way. For
each leaf G" of T, we have:

1. If G" is a bipartite graph, then a sequence of even-pair contractions that
turns G" into a K_2 is easily obtained.

2. If G" is a claw-free graph, then a sequence of even-pair contractions that
turns G" into a clique can be obtained by applying the algorithm described
in [8].
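
For the bipartite case, one simple way to obtain such a sequence (a sketch, not necessarily the authors' procedure) uses the fact that two vertices on the same side of a bipartite graph are joined only by paths of even length, so they form an even pair. Contracting each side down to a single vertex leaves a K_2 when the graph is connected and has at least one edge. The adjacency-dictionary representation and the function names below are assumptions of the example.

```python
def contract(adj, x, y):
    """Contract the non-adjacent pair {x, y}: y is deleted and x inherits its neighbours."""
    assert y not in adj[x], "an even pair consists of non-adjacent vertices"
    new = {v: set(nbrs) for v, nbrs in adj.items() if v != y}
    for w in adj[y]:
        new[w].discard(y)
        new[w].add(x)
        new[x].add(w)
    return new

def contract_bipartite_to_k2(adj, left, right):
    """Return (list of contracted pairs, final graph) for a connected bipartite graph.

    `left` and `right` are the two sides of the bipartition; both are assumed
    non-empty. Every intermediate graph stays bipartite, hence triangle-free.
    """
    sequence = []
    for side in (sorted(left), sorted(right)):
        keep = side[0]
        for other in side[1:]:
            adj = contract(adj, keep, other)
            sequence.append((keep, other))
    return sequence, adj

# Path 0 - 1 - 2 - 3 with sides {0, 2} and {1, 3}.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
pairs, k2 = contract_bipartite_to_k2(path, {0, 2}, {1, 3})
```

Since a dart contains a triangle, the intermediate graphs in this sketch are dart-free, which is the property Lemma 4 asks for.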

Now, since every leaf is a clique, we can glue the leaves by the cutsets that 
produced them, following the tree decomposition. Three cases appear: 

1. Suppose that G_1, ..., G_k are the pieces produced by a separator {a, b}. Since
G_1, ..., G_k are cliques, glueing G_1, ..., G_k by {a, b} (more exactly, by vertices
that correspond to a and b) will produce a graph with a clique-cutset and
such that every piece is a clique. This is a special kind of triangulated graph,
in which a sequence of even-pair contractions leading to a clique can easily
be obtained.

2. Suppose that G_1, ..., G_k are the pieces produced by a rosette and let G' be
the graph obtained by glueing all these pieces (which are cliques) according 
to the rosette. Let G" be the graph obtained from G' by removing (i) every 
vertex that sees all the other vertices, and (ii) every vertex whose neighbours 
form a clique; it is easy to check that any sequence of even-pair contractions 
for G" yields a sequence of even-pair contractions for G'. Moreover, we can 
prove that G" is friendly and bat-free; so it must either have a separator 
or be a bipartite graph or a claw-free graph. Thus, an additional step of 
decomposition and contraction will give us the desired sequence of even-pair 
contractions for G". 

3. Suppose that G_1, ..., G_k are the pieces produced by a clique-cutset. Again,
the graph G' obtained by glueing these pieces along the clique-cutset is a 
triangulated graph for which a sequence of even-pair contractions that turns 
it into a clique can be easily obtained. 

Finally, we can obtain a sequence of even-pair contractions for G by concate- 
nating all the sequences mentioned above. 



Proof of Theorem 1 

Let G be a dart-free graph with no odd hole, no antihole and no odd stretcher.
If G has no twins and both G and its complement are connected, then G is friendly
and so, by Theorem 2, perfectly contractile. If the complement of G is
disconnected, then we can show that G has a very special structure which is easy
to treat separately. If G is disconnected, it is sufficient to argue for each
component of G. Hence we are reduced to the case where G is connected and is
obtained from a friendly graph by replication (making twins). We conduct this
proof by induction, in three steps. First, we slightly modify the construction of
the family T described in Section 2. As we have seen, T was obtained from a
dart-free graph without twins. Unfortunately, twins cannot be bypassed easily in
the question of perfect contractibility (note: it would follow from Everett and
Reed's conjecture that replication preserves perfect contractibility, but no proof
of this fact is known). However, by Theorem B, we need only remove double-claw
twins from a dart-free graph G to be able to construct a family of friendly graphs
from G. It is not hard to see that if a dart-free graph G such that G and its
complement are connected contains a double claw, then it must contain double-claw
twins (see the proof of Theorem A in [12]). So Theorem A can be reformulated as
follows:

Theorem E ([12]) Let G be a dart-free graph without double-claw twins. If G
and its complement are connected, then G is friendly.

Therefore, instead of removing all the twins of a dart-free graph G, we can afford
to remove only the double-claw twins, as follows: initialize G' = G and T = ∅; as
long as G' has a pair of double-claw twins x, y, set G' = G' − x and T = T + x.
Observe that if G contains no odd stretcher, no antihole and no odd hole, then
neither does G'. Now let the family T' be obtained from G' just like T was obtained
in the previous section.

The second step is to examine each friendly graph F of T', to add back
its twins and to prove that it is perfectly contractile. Since G' contains no odd
stretcher, no antihole and no odd hole, neither does F; hence, by Theorem 2, F is
perfectly contractile. Denote by T_F the set of double-claw twins of F, and by
F + T_F the graph obtained by adding back the twins in F. Since F is friendly,
we can consider the tree decomposition of F. Five cases appear:

(1) F contains a clique-cutset. Then F + T_F contains a clique-cutset C.
Every piece of F + T_F with respect to C is an induced subgraph of G, and so
it is perfectly contractile. Moreover, clearly, each even pair of F + T_F is an even
pair of the whole graph.

(2) F contains a rosette (centered at a vertex z). Suppose first that z has
no twins in T_F. Then F + T_F also contains a rosette centered at z, and the
proof works as usual. Now suppose that z has a set of twins T(z) ⊆ T_F. We
can generalize the rosette in the following way: remove the vertices z + T(z) and
all the z edges, and construct the pieces G_1, ..., G_k as before, except that
z + T(z) (instead of z alone) lies in every piece. Each piece is an induced
subgraph of G, so it is perfectly contractile. Moreover we can prove that each even
pair in a piece is an even pair of the whole graph. The desired result can then be
obtained as in the twin-free case above.

(3) F has a separator {a, b}. Let A = {a, a_1, ..., a_l} and B = {b, b_1, ..., b_r}
(r ≥ l ≥ 0) be the sets of twins of a and b respectively. If {a, b} is an even pair,
then we do the following sequence of contractions: {a, b}, {a_1, b_1}, ..., {a_l, b_l}.
A lemma (whose proof is omitted here) ensures that this is a valid sequence of
even-pair contractions. The result is a graph with a clique-cutset C ∪ R, where
C consists of the l contracted vertices and R is made of the r − l remaining
vertices of B. For each piece G_i of this graph with respect to C, the graph
G_i − R is isomorphic to a piece of F/ab with vertex ab replicated l times. This
fact, together with the fact that all the pieces of F with respect to {a, b} are
perfectly contractile, and with the induction hypothesis, implies that F + T_F is
perfectly contractile. If {a, b} is an odd pair, a different type of modification of
the construction of the pieces is introduced; we skip this subcase for the sake of
brevity.

(4) F is bipartite. Then the vertices of F + T_F can be divided into two sides
such that every connected component of each side is a clique and every clique
from one side sees all or none of the vertices of any clique on the other side. It
is easy to check directly that such a graph is perfectly contractile.

(5) F is claw-free. Then F + T_F is claw-free. So F + T_F, like F, is perfectly
contractile.

Lemma 6. If F is a perfectly contractile friendly graph, then F + T_F is perfectly
contractile.

The third and last step of the proof of Theorem 1 is the following lemma: 

Lemma 7. A dart-free graph G' without double-claw twins is perfectly contrac- 
tile if and only if every friendly graph H of F' is perfectly contractile. 

Finally, given a dart-free graph G that contains no odd hole, no antihole
and no odd stretcher, we obtain a graph G' that contains no double-claw twins and
is decomposable into friendly graphs. By Theorem 2 these graphs are perfectly
contractile, and by Lemma 6, adding the twins back to these graphs preserves
their perfect contractibility. So, the modified family T' is a set of perfectly
contractile graphs. By Lemma 7, G is perfectly contractile, and the proof of
Theorem 1 is now complete.

4 Conclusion 

The many positive results gathered in the past few years about Conjecture 1 
(see [4]) motivate us to believe strongly in its validity and to continue our study 
of this conjecture. 

Abstract. The chromatic index problem - finding the minimum number
of colours required for colouring the edges of a graph - is still unsolved
for indifference graphs, whose vertices can be linearly ordered so that
the vertices contained in the same maximal clique are consecutive in
this order. Two adjacent vertices are twins if they belong to the same
maximal cliques. A graph is reduced if it contains no pair of twin
vertices. A graph is overfull if the total number of edges is greater than the
product of the maximum degree by ⌊n/2⌋, where n is the number of
vertices. We give a structural characterization for neighbourhood-overfull
indifference graphs proving that a reduced indifference graph cannot be
neighbourhood-overfull. We show that the chromatic index for all
reduced indifference graphs is the maximum degree.



1 Introduction 

In this paper, G denotes a simple, undirected, finite, connected graph. The sets
V(G) and E(G) are the vertex and edge sets of G. Denote |V(G)| by n and
|E(G)| by m. A graph with just one vertex is called trivial. A clique is a set of
vertices pairwise adjacent in G. A maximal clique of G is a clique not properly
contained in any other clique. A subgraph of G is a graph H with V(H) ⊆ V(G)
and E(H) ⊆ E(G). For X ⊆ V(G), denote by G[X] the subgraph induced by X,
that is, V(G[X]) = X and E(G[X]) consists of those edges of E(G) having both
ends in X. For Y ⊆ E(G), the subgraph induced by Y is the subgraph of G whose
vertex set is the set of endpoints of edges in Y and whose edge set is Y; this
subgraph is denoted by G[Y]. The notation G \ Y denotes the subgraph of G
with V(G \ Y) = V(G) and E(G \ Y) = E(G) \ Y. A graph G is H-free if G
does not contain an isomorphic copy of H as an induced subgraph. Denote by
C_n the chordless cycle on n vertices and by 2K_2 the complement of the chordless
cycle C_4. A matching M of G is a set of pairwise nonadjacent edges of G. A
matching M of G covers a set of vertices X of G when each vertex of X is
incident to some edge of M. The graph G[M] is also called a matching.

For each vertex v of a graph G, the adjacency Adj_G(v) of v is the set of
vertices that are adjacent to v. The degree of a vertex v is deg(v) = |Adj_G(v)|.
The maximum degree of a graph G is then Δ(G) = max_{v ∈ V(G)} deg(v). We use
the simplified notation Δ when there is no ambiguity. We call Δ-vertex a vertex
with maximum degree. The set N[v] denotes the neighbourhood of v, that is,
N[v] = Adj_G(v) ∪ {v}. A subgraph induced by the neighbourhood of a vertex
is simply called a neighbourhood. We call Δ-neighbourhood the neighbourhood
of a Δ-vertex. Two vertices v and w are twins when N[v] = N[w]. Equivalently,
two vertices are twins when they belong to the same set of maximal cliques. A
graph is reduced if it contains no pair of twin vertices. The reduced graph G' of
a graph G is the graph obtained from G by collapsing each set of twins into a
single vertex and removing possible resulting parallel edges and loops.
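
The definitions of twins and of the reduced graph translate almost literally into code. The following sketch is an illustration only; the adjacency-dictionary representation and the function names are assumptions, and each set of twins is represented by its first vertex in dictionary order.

```python
def closed_neighbourhood(adj, v):
    """N[v]: the neighbours of v together with v itself."""
    return frozenset(adj[v]) | {v}

def twin_classes(adj):
    """Group vertices with identical closed neighbourhoods."""
    classes = {}
    for v in adj:
        classes.setdefault(closed_neighbourhood(adj, v), []).append(v)
    return list(classes.values())

def reduced_graph(adj):
    """Collapse every class of twins to one representative, dropping loops and parallel edges."""
    rep = {}
    for cls in twin_classes(adj):
        for v in cls:
            rep[v] = cls[0]
    reduced = {r: set() for r in set(rep.values())}
    for v, nbrs in adj.items():
        for w in nbrs:
            if rep[v] != rep[w]:          # skip edges inside a twin class (would be loops)
                reduced[rep[v]].add(rep[w])
    return reduced

# Triangle 0, 1, 2 with a pendant vertex 3 attached to 2: vertices 0 and 1 are twins.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
reduced = reduced_graph(graph)   # the path 0 - 2 - 3
```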

The chromatic index χ'(G) of a graph G is the minimum number of colours
needed to colour the edges of G such that no adjacent edges get the same colour.
A celebrated theorem by Vizing [12, 10] states that χ'(G) is always Δ or Δ + 1.
Graphs with χ'(G) = Δ are said to be in Class 1; graphs with χ'(G) = Δ + 1
are said to be in Class 2. A graph G satisfying the inequality m > Δ(G)⌊n/2⌋
is said to be an overfull graph [8]. A graph G is subgraph-overfull [8] when it has
an overfull subgraph H with Δ(H) = Δ(G). When the overfull subgraph H can
be chosen to be a neighbourhood, we say that G is neighbourhood-overfull [4].
Overfull, subgraph-overfull, and neighbourhood-overfull graphs are in Class 2.
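
The overfull and neighbourhood-overfull conditions are simple counting tests. A minimal sketch follows, assuming an adjacency-dictionary representation; the function names are chosen for the example and do not come from the paper.

```python
def max_degree(adj):
    return max(len(nbrs) for nbrs in adj.values())

def edge_count(adj):
    return sum(len(nbrs) for nbrs in adj.values()) // 2

def is_overfull(adj):
    """m > Delta * floor(n / 2)."""
    return edge_count(adj) > max_degree(adj) * (len(adj) // 2)

def induced(adj, vertices):
    """Subgraph induced by a set of vertices."""
    vs = set(vertices)
    return {v: adj[v] & vs for v in vs}

def is_neighbourhood_overfull(adj):
    """Some neighbourhood N[v] is overfull and has the same maximum degree as the graph."""
    delta = max_degree(adj)
    for v in adj:
        nb = induced(adj, adj[v] | {v})
        if max_degree(nb) == delta and is_overfull(nb):
            return True
    return False

# K_3 has n = 3, m = 3 and Delta = 2, so m = 3 > 2 * 1 and K_3 is overfull.
k3 = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
```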

It is well known that the recognition problem for the set of graphs in Class 1
is NP-complete [9]. The problem remains NP-complete for several classes,
including comparability graphs [1]. On the other hand, the problem remains unsolved
for indifference graphs: graphs whose vertices can be linearly ordered so that the 
vertices contained in the same maximal clique are consecutive in this order [11]. 
We call such an order an indifference order. Given an indifference graph, for 
each maximal clique A, we call maximal edge an edge whose endpoints are the 
first and the last vertices of A with respect to an indifference order. Indifference 
graphs form an important subclass of interval graphs: they are also called unitary 
interval graphs or proper interval graphs. The reduced graph of an indifference 
graph is an indifference graph with a unique indifference order (except for its 
reverse). This uniqueness property was used to describe solutions for the recog- 
nition problem and for the isomorphism problem for the class of indifference 
graphs [2]. 
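
A given vertex order can be checked against this definition without listing maximal cliques. The sketch below relies on an equivalent formulation, stated here as an assumption of the example rather than taken from the paper: the vertices of every maximal clique are consecutive exactly when, for every edge, all vertices lying between its endpoints in the order (endpoints included) are pairwise adjacent. The function name and graph representation are likewise illustrative.

```python
def is_indifference_order(adj, order):
    """Check that the endpoints of every edge span a clique in `order`."""
    if len(order) != len(adj) or set(order) != set(adj):
        return False
    pos = {v: i for i, v in enumerate(order)}
    for u in adj:
        for w in adj[u]:
            i, j = pos[u], pos[w]
            if i < j:
                block = order[i:j + 1]
                for k, a in enumerate(block):
                    if any(b not in adj[a] for b in block[k + 1:]):
                        return False
    return True

# The path 0 - 1 - 2 is an indifference graph with order 0, 1, 2.
p3 = {0: {1}, 1: {0, 2}, 2: {1}}
```

This check is quadratic or worse in the clique sizes and is meant only to make the definition concrete, not to recognise indifference graphs efficiently.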

It has been shown that every odd maximum degree indifference graph is 
in Class 1 [4] and that every subgraph-overfull indifference graph is in fact 
neighbourhood-overfull [3]. It has been conjectured that every Class 2 indif- 
ference graph is neighbourhood-overfull [4,3]. Note that the validity of this con- 
jecture implies that the edge-colouring problem for indifference graphs is in P. 

The goal of this paper is to investigate this conjecture by giving further
positive evidence for its validity. We describe a structural characterization for
neighbourhood-overfull indifference graphs. This structural characterization
implies that no reduced indifference graph is neighbourhood-overfull. We prove
that all reduced indifference graphs are in Class 1 by exhibiting an edge
colouring with Δ colours for every indifference graph with no twin Δ-vertices. In
order to construct such an edge colouring with Δ colours, we decompose an arbitrary
indifference graph with no twin Δ-vertices into two indifference graphs: a
matching covering all Δ-vertices and an odd maximum degree indifference graph.

The characterization for neighbourhood-overfull indifference graphs is
described in Section 2. The decomposition and the edge colouring of indifference
graphs with no twin Δ-vertices is in Section 3. Our conclusions are in Section 4.



2 Neighbourhood-Overfull Indifference Graphs 

In this section we study the overfull Δ-neighbourhoods of an indifference graph.
Since it is known that every odd maximum degree indifference graph is in
Class 1 [4], an odd maximum degree indifference graph contains no overfull
Δ-neighbourhoods. We consider the case of even maximum degree indifference
graphs. A nontrivial complete graph with even maximum degree Δ is always
an overfull Δ-neighbourhood. We characterize the structure of an overfull
Δ-neighbourhood obtained from a complete graph by removal of a set of edges.

Theorem 1. Let K_{Δ+1} be a complete graph with even maximum degree Δ. Let
F = K_{Δ+1} \ R, where R is a nonempty subset of edges of K_{Δ+1}. Then, the
graph F is an overfull indifference graph with maximum degree Δ if and only if
H = G[R] is a 2K_2-free bipartite graph with at most Δ/2 − 1 edges.
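
The condition on the removed edge set R in Theorem 1 can be tested mechanically. The sketch below is an illustration under stated assumptions: R is given as a list of vertex pairs of the complete graph, Δ is even, and the function names are invented for the example.

```python
from itertools import combinations

def is_bipartite(edges):
    """2-colour the graph formed by `edges`; fail on an odd cycle."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    colour = {}
    for start in adj:
        if start in colour:
            continue
        colour[start] = 0
        stack = [start]
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w not in colour:
                    colour[w] = 1 - colour[u]
                    stack.append(w)
                elif colour[w] == colour[u]:
                    return False
    return True

def has_induced_2k2(edges):
    """True when two disjoint edges have no edge joining their endpoints."""
    edge_set = {frozenset(e) for e in edges}
    for e, f in combinations(edge_set, 2):
        if e & f:
            continue
        if not any(frozenset((a, b)) in edge_set for a in e for b in f):
            return True
    return False

def satisfies_theorem_1(delta, removed):
    """Right-hand side of Theorem 1 for a nonempty edge set `removed` and even `delta`."""
    return (0 < len(removed) <= delta // 2 - 1
            and is_bipartite(removed)
            and not has_induced_2k2(removed))

# With Delta = 6 (the graph K_7), removing the two edges of a star passes the test.
ok = satisfies_theorem_1(6, [(0, 1), (0, 2)])
```

Removing two disjoint edges from K_7 instead, for instance (0, 1) and (2, 3), fails the test because those two edges form an induced 2K_2 in H.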

The proof of Theorem 1 is divided into two lemmas. 

Lemma 1. Let K_{Δ+1} be a complete graph with even maximum degree Δ. Let R
be a nonempty subset of edges of K_{Δ+1}. If F = K_{Δ+1} \ R is an overfull
indifference graph with maximum degree Δ, then H = G[R] is a 2K_2-free bipartite
graph and |R| < Δ/2.

Proof. Let R be a nonempty subset of edges of K_{Δ+1}, a complete graph with
even maximum degree Δ. Let F = K_{Δ+1} \ R be an overfull indifference graph
with maximum degree Δ. Note that Δ > 2.

Because F is an overfull graph, |V(F)| is odd and there are at most Δ/2 − 1
missing edges joining vertices of F. Hence |R| < Δ/2.
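
The count behind this step can be written out explicitly. Here F has n = Δ + 1 vertices, which is odd because Δ is even, and m = Δ(Δ + 1)/2 − |R| edges, so the overfull inequality reduces as follows (the last equivalence uses that Δ/2 is an integer):

```latex
\[
m > \Delta \left\lfloor \frac{n}{2} \right\rfloor
\;\iff\;
\frac{\Delta(\Delta+1)}{2} - |R| > \Delta \cdot \frac{\Delta}{2}
\;\iff\;
|R| < \frac{\Delta}{2}
\;\iff\;
|R| \le \frac{\Delta}{2} - 1 .
\]
```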

Suppose, by contradiction, that the graph H = G[R] contains a 2K_2 as an
induced subgraph. Then F contains a chordless cycle C_4 as an induced subgraph,
a contradiction to F being an indifference graph. Since H is a 2K_2-free graph,
we conclude that the graph H does not contain the chordless cycle C_k, k ≥ 6.
We show that the graph H contains neither a C_5 nor a C_3 as an induced
subgraph. Assume the contrary. If H contains a C_5 as an induced subgraph, then
F contains a C_5 as an induced subgraph, since C_5 is a self-complementary graph,
a contradiction to F being an indifference graph. If H contains a C_3 as an induced
subgraph, then F contains a K_{1,3} as an induced subgraph, since, by hypothesis,
F has at least one vertex of degree Δ, a contradiction to F being an indifference
graph. Therefore, H is a bipartite graph, without 2K_2, and |R| < Δ/2. □

Lemma 2. Let K_{Δ+1} be a complete graph with even maximum degree Δ. If
H = G[R] is a 2K_2-free bipartite graph induced by a nonempty set R of edges
of size |R| < Δ/2, then F = K_{Δ+1} \ R is an overfull indifference graph with
maximum degree Δ.

Proof. By definition of F and because |R| < Δ/2, we have that F is an overfull
graph and it has vertices of degree Δ. We shall prove that F is an indifference
graph by exhibiting an indifference order on the vertex set V(F) of F.

Since H = G[R] is a 2K_2-free bipartite graph, H is connected with unique
bipartition of its vertex set into sets X and Y. Now, label the vertices
x_1, x_2, ..., x_k of X and label the vertices y_1, y_2, ..., y_l of Y according to
degree ordering: labels correspond to vertices in nonincreasing vertex degree order,
i.e., deg(x_1) ≥ deg(x_2) ≥ ... ≥ deg(x_k) and deg(y_1) ≥ deg(y_2) ≥ ... ≥ deg(y_l),
respectively.

This degree ordering induces the following properties on the vertices of the
adjacency of each vertex of X and Y:

— The adjacency of a vertex of H defines an interval on the degree order, i.e.,
Adj_H(x_i) = {y_j : 1 ≤ p ≤ j ≤ p + q ≤ l} and
Adj_H(y_j) = {x_i : 1 ≤ r ≤ i ≤ r + s ≤ k}.

Indeed, let a be a vertex of H such that Adj_H(a) is not an interval. Then
Adj_H(a) has at least two vertices b and d, and there is a vertex c such
that ac ∉ R between b and d. Without loss of generality, suppose that
deg(b) ≥ deg(c) ≥ deg(d). Since deg(c) ≥ deg(d), there is a vertex e such
that e is adjacent to c but is not adjacent to d. It follows that, when either
deg(e) ≤ deg(a) or deg(a) ≤ deg(e), H has an induced 2K_2 (ec and ad), a
contradiction.

— The adjacency-sets of the vertices of H are ordered with respect to set
inclusion according to the following containment property:
Adj_H(x_1) ⊇ Adj_H(x_2) ⊇ ... ⊇ Adj_H(x_k) and
Adj_H(y_1) ⊇ Adj_H(y_2) ⊇ ... ⊇ Adj_H(y_l).

For, suppose there are a and b in X with deg(a) ≥ deg(b) and Adj_H(a) ⊉
Adj_H(b). Hence, there are vertices c and d such that c is adjacent to a but
not to b, and d is adjacent to b but not to a. The edges ac and bd induce a
2K_2 in H, a contradiction.

— x_1 y_1 is a dominating edge of H, i.e., every vertex of H is adjacent to x_1 or
to y_1.

This is a direct consequence of the two properties above.

When H is not a complete bipartite graph, let i and j be the smallest indices
of vertices of X and Y, respectively, such that x_i y_j is not an edge of H. Note
that, because x_1 y_1 is a dominating edge of H, we have i and j greater than 1.
Define the following partition of V(H):

A := {x_1, x_2, ..., x_{i−1}}; S := {x_i, x_{i+1}, ..., x_k};

B := {y_1, y_2, ..., y_{j−1}}; T := {y_j, y_{j+1}, ..., y_l}.

Note that S and T can be empty sets and that the graph induced by A ∪ B
is a complete bipartite subgraph of H.

Now we describe a total ordering on V(F) as follows. We shall prove that
this total ordering gives the desired indifference order.

— First, list the vertices of X = A ∪ S as x_1, x_2, ..., x_k;

— Next, list all the Δ-vertices of F, v_1, v_2, ..., v_g;

— Finally, list the vertices of Y = T ∪ B as y_l, y_{l−1}, ..., y_1.

The ordering within the sets A, S, D, T, B, where D denotes the set of the
Δ-vertices of F, is induced by the ordering of V(F).

By the containment property of the adjacency-sets of the vertices of H and
because each adjacency defines an interval, the consecutive vertices of the same
degree are twins in F. Hence, it is enough to show that the ordering induced on
the reduced graph F' of F is an indifference order.

For simplicity, we use the same notation for vertices of F and F', i.e., we
call vertices in F' corresponding to vertices of X by x'_1, x'_2, ..., x'_{k'}, and we
call vertices corresponding to vertices of Y by y'_1, y'_2, ..., y'_{l'}. Note that the
set D contains only one representative vertex in F', and we denote this unique
representative vertex by v'_1.

By definition of F', x'_1 v'_1, ..., x'_p y'_q, ..., v'_1 y'_1 are edges
of F'. Since vertex v'_1 is a representative vertex of a Δ-vertex of F, it is also a
Δ-vertex of F'. Thus v'_1 is adjacent to each vertex of F'. Each edge listed above,
distinct from x'_1 v'_1 and v'_1 y'_1, has the form x'_p y'_q. We want to show that x'_p
is adjacent to all vertices from x'_{p+1} up to y'_q with respect to the order. For
suppose, without loss of generality, that x'_p is not adjacent to some vertex z
between x'_p and y'_q with respect to the order. Now by the definition of the graphs
F and H, every edge of K_{Δ+1} not in F belongs to graph H. Since H is a bipartite
graph, with bipartition of its vertex set into sets X and Y, we have in F all edges
linking vertices in X, and so we have z ≠ x_s, p < s ≤ k. Vertex z is also distinct
from y_s, q ≤ s ≤ l, by the properties of the adjacency in H. Hence, x'_p is adjacent
to all vertices from x'_{p+1} up to y'_q with respect to the order. It follows that each
edge listed above defines a maximal clique of F'. Hence, this ordering satisfies
the property that vertices belonging to the same maximal clique are consecutive
and we conclude that this ordering on V(F') is the desired indifference order.
This conclusion completes the proofs of both Lemma 2 and Theorem 1. □



Corollary 1. Let G be an indifference graph. A Δ-neighbourhood of G with at
most Δ/2 vertices of maximum degree is not neighbourhood-overfull.

Proof. Let F be a Δ-neighbourhood of G with at most Δ/2 vertices of degree Δ.
If Δ is odd, then F is not neighbourhood-overfull. If Δ is even, then we use the
notation of Lemma 1 and Lemma 2. The hypothesis implies |X| + |Y| ≥ (Δ/2) + 1.
Since vertex x_1 misses every vertex of Y and since vertex y_1 misses every vertex
of X, there are at least |X| + |Y| − 1 missing edges having as endpoints x_1 or y_1.
Hence, there are at least |X| + |Y| − 1 ≥ Δ/2 missing edges in F, and F cannot
be neighbourhood-overfull. □

Corollary 2. An indifference graph with no twin Δ-vertices is not
neighbourhood-overfull.

Proof. Let G be an indifference graph with no twin Δ-vertices. The hypothesis
implies that every Δ-neighbourhood F of G contains precisely one vertex of
degree Δ. Now Corollary 1 says F is not neighbourhood-overfull and therefore
G itself cannot be neighbourhood-overfull. □

Corollary 3. A reduced indifference graph is not neighbourhood-overfull. □

3 Reduced Indifference Graphs 

We have established in Corollary 2 of Section 2 that an indifference graph with
no twin Δ-vertices is not neighbourhood-overfull, a necessary condition for an
indifference graph with no twin Δ-vertices to be in Class 1. In this section, we
prove that every indifference graph with no twin Δ-vertices is in Class 1. We
exhibit a Δ-edge colouring for an even maximum degree indifference graph with
no twin Δ-vertices. Since every odd maximum degree indifference graph is in
Class 1, this result implies that all indifference graphs with no twin Δ-vertices,
and in particular all reduced indifference graphs, are in Class 1.

Let E_1, ..., E_k be a partition of the edge set of a graph G. It is clear that if
the subgraphs G[E_i], 1 ≤ i ≤ k, satisfy Δ(G) = Σ_i Δ(G[E_i]) and if, for each i,
G[E_i] is in Class 1, then G is also in Class 1. We apply this decomposition
technique to our given indifference graph with even maximum degree and no
twin Δ-vertices.

We partition the edge set of an indifference graph G with even maximum
degree Δ and with no twin Δ-vertices into two sets E_1 and E_2, such that
G_1 = G[E_1] is an odd maximum degree indifference graph and G_2 = G[E_2]
is a matching.

Let G be an indifference graph and v_1, v_2, ..., v_n an indifference order for G.
By definition, an edge v_i v_j is maximal if there does not exist another edge v_k v_l
with k ≤ i and j ≤ l. Note that an edge v_i v_j is maximal if and only if the edges
v_{i−1} v_j and v_i v_{j+1} do not exist. In addition, every maximal edge v_i v_j defines
a maximal clique having v_i as its first vertex and v_j as its last vertex. Thus,
every vertex is incident to zero, one, or two maximal edges. Moreover, given an
indifference graph with an indifference order and an edge that is maximal with
respect to this order, the removal of this edge gives a smaller indifference graph:
the original indifference order is an indifference order for the smaller indifference
graph.
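
The local test for maximality (neither v_{i−1} v_j nor v_i v_{j+1} is an edge) is easy to apply to every edge. A small sketch, assuming the graph is an adjacency dictionary and `order` is an indifference order given as a list; the function name is illustrative.

```python
def maximal_edges(adj, order):
    """Edges v_i v_j (i < j) such that neither v_{i-1} v_j nor v_i v_{j+1} is an edge."""
    n = len(order)

    def edge(i, j):
        # Out-of-range positions never form an edge.
        return 0 <= i and j < n and order[j] in adj[order[i]]

    result = []
    for i in range(n):
        for j in range(i + 1, n):
            if edge(i, j) and not edge(i - 1, j) and not edge(i, j + 1):
                result.append((order[i], order[j]))
    return result

# Triangle 0, 1, 2 with pendant vertex 3 on 2, in the indifference order 0, 1, 2, 3.
g = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
edges = maximal_edges(g, [0, 1, 2, 3])   # one maximal edge per maximal clique
```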

Based on Lemma 3 below, we shall formulate an algorithm for choosing a
matching of an indifference graph with no twin Δ-vertices that covers every
Δ-vertex of G.

Lemma 3. Let G be a nontrivial graph. If G is an indifference graph with no
twin Δ-vertices, then every Δ-vertex of G is incident to precisely two maximal
edges.

Proof. Let G be a nontrivial indifference graph without twin Δ-vertices and let
v be a Δ-vertex of G. Consider v_1, v_2, ..., v_n an indifference order for G. Because
v is a Δ-vertex of G and G is not a clique, we have v = v_j, with j ≠ 1, n. Let v_i
and v_k be the leftmost and the rightmost vertices with respect to the indifference
order that are adjacent to v_j, respectively. Suppose that v_i v_j is not a maximal
edge. Then v_{i−1} v_j or v_i v_{j+1} is an edge in G. The existence of v_{i−1} v_j
contradicts v_i being the leftmost neighbour of v_j. Because v_j and v_{j+1} are not
twins, the existence of v_i v_{j+1} implies deg(v_{j+1}) ≥ Δ + 1, a contradiction.
Analogously, we have that v_j v_k is also a maximal edge. □

We now describe an algorithm for choosing a set of maximal edges that covers
all Δ-vertices of G.



Input: an indifference graph G with no twin Δ-vertices, with an indifference order
v_1, ..., v_n of G.

Output: a set of edges M that covers all Δ-vertices of G.

1. For each Δ-vertex of G, say v_j, in the indifference order, put in a set ℰ the
edge v_i v_j, where v_i is its leftmost neighbour with respect to the indifference
order. Each component of the graph G[ℰ] is a path. (Each component H of
G[ℰ] has Δ(H) ≤ 2 and none of the components is a cycle, by the maximality
of the chosen edges.)

2. For each path component P of G[ℰ], number each edge with consecutive
integers starting from 1. If a path component P_i contains an odd number
of edges, then form a matching M_i of G[ℰ] choosing the edges numbered by
odd integers. If a path component P_j contains an even number of edges, then
form a matching M_j choosing the edges numbered by even integers.

3. The desired set of edges M is the union ⋃_k M_k.
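
The three steps above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the graph is an adjacency dictionary, `order` is an indifference order given as a list, the input is assumed to be connected and nontrivial with no twin maximum-degree vertices, and edges of each path are numbered starting from its leftmost vertex. The function name is invented for the example.

```python
def covering_matching(adj, order):
    """Matching that covers every maximum-degree vertex, following steps 1 to 3."""
    pos = {v: i for i, v in enumerate(order)}
    delta = max(len(nbrs) for nbrs in adj.values())

    # Step 1: each maximum-degree vertex v_j contributes the edge to its leftmost neighbour v_i.
    right_of = {}            # left endpoint -> right endpoint of a chosen edge
    is_right_end = set()     # vertices that are the right endpoint of a chosen edge
    for v in order:
        if len(adj[v]) == delta:
            leftmost = min(adj[v], key=pos.__getitem__)
            if pos[leftmost] > pos[v] or leftmost in right_of:
                raise ValueError("input does not satisfy the hypotheses of the algorithm")
            right_of[leftmost] = v
            is_right_end.add(v)

    # Steps 2 and 3: walk each path from its leftmost vertex and keep the
    # odd-numbered edges of odd-length paths, the even-numbered edges otherwise.
    matching = []
    for start in order:
        if start in right_of and start not in is_right_end:
            path_edges = []
            u = start
            while u in right_of:
                path_edges.append((u, right_of[u]))
                u = right_of[u]
            parity = len(path_edges) % 2
            matching.extend(e for number, e in enumerate(path_edges, start=1)
                            if number % 2 == parity)
    return matching

# Path 0 - 1 - 2 - 3: the maximum-degree vertices are 1 and 2.
p4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
m = covering_matching(p4, [0, 1, 2, 3])
```

On this path the chosen edges (0, 1) and (1, 2) form a single path with two edges, so only the even-numbered edge (1, 2) is kept; it covers both vertices of degree 2, and the uncovered first vertex 0 is not a maximum-degree vertex.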



We claim that the matching M defined above covers all Δ-vertices of G. For,
if a path component of G[ℰ] contains an odd number of edges, then M covers all
of its vertices. If a path component of G[ℰ] contains an even number of edges,
then the only vertex not covered by M is the first vertex of this path component.
However, by definition of G[ℰ], this vertex is not a Δ-vertex of G.

Theorem 2. If G is an indifference graph with no twin Δ-vertices, then G is
in Class 1.

Proof. Let G be an indifference graph with no twin Z\-vertices. If G has odd 
maximum degree, then G is in Class 1 [4]. 

Suppose that G is an even maximum degree graph. Let vi,...,Vn be an 
indifference order of G. Use the algorithm described above to find a matching M 
for G that covers all Z\-vertices of G. The graph G \ M is an indifference graph 
with odd maximum degree because the vertex sets of G and G \ M are the 
same and the indifference order of G is also an indifference order for G \ M . 
Moreover, since M is a matching that covers all Z\- vertices of G, we have that 
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A{G \ M) = Z\ — 1 is odd. Hence, the edges of G \ M can be coloured with 
A — 1 colours and one additional colour is needed to colour the edges in the 
matching M. This implies that G is in Class 1. □ 

Corollary 4. All reduced indifference graphs are in Class 1. □ 

4 Conclusions 

We believe our work makes a contribution to the problem of edge-colouring 
indifference graphs in three respects. 

First, our results on the colouring of indifference graphs show that, in all 
cases we have studied, neighbourhood-overfullness is equivalent to being Class 2, 
which gives positive evidence to the conjecture that for any indifference graph 
neighbourhood-overfullness is equivalent to being Class 2. It would be interesting 
to extend these results to larger classes. We established recently [5] that every 
odd maximum degree dually chordal graph is Class 1. This result shows that our
techniques are extendable to other classes of graphs. 

Second, our results apply to a subclass of indifference graphs defined recently 
in the context of clique graphs. A graph G is a minimum indifference graph if 
G is a reduced indifference graph and, for some indifference order of G, every 
vertex of G is the first or the last element of a maximal clique of G [6]. Given two
distinct minimum indifference graphs, their clique graphs are also distinct [6]. 
This property is not true for general indifference graphs [7]. Note that our
results apply to minimum indifference graphs: no minimum indifference graph is
neighbourhood-overfull, every minimum indifference graph is in Class 1, and we
can edge-colour any minimum indifference graph with $\Delta$ colours.

Third, and perhaps more important, the decomposition techniques we use to
show these results are new and have proved to be simple but powerful tools.

Abstract. We propose two conjectures on the chromatic polynomial
of a graph and show their validity for several classes of graphs. Our
conjectures are stronger than an older conjecture of Bartels and Welsh
[1].

Keywords: Vertex colorings, chromatic polynomials of graphs 



The goal of this paper is to propose two conjectures on the chromatic polynomial 
of a graph and prove them for several classes of graphs. Our conjectures are 
stronger than a conjecture of Bartels and Welsh [1] that was recently proved by
Dong [2]. 

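Let $G$ be a graph with $n$ vertices. The chromatic polynomial of $G$ is related to the colourings
(of the vertices) of $G$ so that no two adjacent vertices get the same colour. If we
denote by $c_k(G)$ the number of ways to colour the vertices of $G$ with exactly $k$
colours (that is, the number of partitions of the vertex set into $k$ non-empty
colour classes), then the chromatic polynomial of $G$ is:

\[ P(G, \lambda) = \sum_{k=1}^{n} c_k(G)\,(\lambda)_k, \]

where $(\lambda)_k = \binom{\lambda}{k}\,k!$.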
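The expansion of the chromatic polynomial in falling factorials can be checked numerically on small graphs. The following Python sketch is only an illustration under stated assumptions: vertices are numbered 0 to n-1, edges are given as pairs, and the function names are hypothetical. It counts proper colourings by brute force and compares the result with the sum over partitions into stable sets, so it is exponential and meant for tiny examples only.

```python
from itertools import product
from math import comb, factorial


def colourings(n, edges, lam):
    """P(G, lam) by brute force: count proper colourings of vertices 0..n-1."""
    return sum(
        all(c[u] != c[v] for u, v in edges)
        for c in product(range(lam), repeat=n)
    )


def stable_partitions(n, edges):
    """counts[k] = number of partitions of the vertices into k non-empty stable sets."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    counts = [0] * (n + 1)

    def place(v, blocks):
        if v == n:
            counts[len(blocks)] += 1
            return
        for block in blocks:
            if not (adj[v] & block):  # v has no neighbour in this block
                block.add(v)
                place(v + 1, blocks)
                block.remove(v)
        blocks.append({v})  # or v opens a new block
        place(v + 1, blocks)
        blocks.pop()

    place(0, [])
    return counts


def chromatic_polynomial(n, edges, lam):
    """Evaluate P(G, lam) as the sum of c_k(G) * (lam)_k."""
    c = stable_partitions(n, edges)
    return sum(c[k] * comb(lam, k) * factorial(k) for k in range(1, n + 1))
```

For the cycle on four vertices and three colours both functions return 18, in agreement with the closed form $(\lambda-1)^4 + (\lambda-1)$ for this cycle.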

Let $\omega(G)$ denote the clique number of $G$ (maximum number of pairwise
adjacent vertices) and let $\chi(G)$ denote the chromatic number of $G$ (minimum
number of colours used in a colouring).

We propose the following two conjectures on $P(G, \lambda)$:



Conjecture 1

\[ \frac{P(G,\lambda-1)}{P(G,\lambda)} \leq \frac{\lambda-\chi(G)}{\lambda} \left( \frac{\lambda-1}{\lambda} \right)^{n-\chi(G)} \qquad \forall \lambda \geq n; \tag{1} \]

Conjecture 2

\[ \frac{P(G,\lambda-1)}{P(G,\lambda)} \leq \frac{\lambda-\omega(G)}{\lambda} \left( \frac{\lambda-1}{\lambda} \right)^{n-\omega(G)} \qquad \forall \lambda \geq n. \tag{2} \]

* This research was performed while the author was visiting IASI.

Conjectures 1 and 2 are related to a conjecture of Bartels and Welsh [1],
known as the Shameful Conjecture:

The Shameful Conjecture: For every graph $G$ with $n$ vertices,

\[ \frac{P(G,n-1)}{P(G,n)} \leq \left( \frac{n-1}{n} \right)^{n}. \]

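The Shameful Conjecture was recently proved by Dong [2], who showed that
for every connected graph $G$ with $n$ vertices,

\[ \frac{P(G,\lambda-1)}{P(G,\lambda)} \leq \frac{\lambda-2}{\lambda} \left( \frac{\lambda-2}{\lambda-1} \right)^{n-2} \qquad \forall \lambda \geq n. \tag{3} \]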



What relates Conjectures 1 and 2 to the conjecture of Bartels and Welsh is that
both are stronger than their conjecture. To show that, let us first prove the following
easy inequality:

\[ \frac{m-k}{m} \leq \left( \frac{m-1}{m} \right)^{k} \qquad \text{for every two integers } m, k \text{ with } m > k \geq 0. \tag{4} \]



The validity of (4) immediately comes from the following inequality, which is
strict if $k > 1$:

\[ \frac{m-k}{m} = \prod_{i=1}^{k} \frac{m-i}{m-i+1} \leq \prod_{i=1}^{k} \frac{m-1}{m} = \left( \frac{m-1}{m} \right)^{k}. \]

Now, (4) immediately implies that Conjecture 1 is stronger than Conjecture 2
(write (4) with $m = \lambda - \omega(G)$ and $k = \chi(G) - \omega(G)$, and note that
$\frac{m-1}{m} \leq \frac{\lambda-1}{\lambda}$ since $m \leq \lambda$). To see that Conjecture 2
implies the Shameful Conjecture, write (2) with $\lambda = n$, that is:

\[ \frac{P(G,n-1)}{P(G,n)} \leq \frac{n-\omega(G)}{n} \left( \frac{n-1}{n} \right)^{n-\omega(G)}, \]

and apply inequality (4) with $m = n$ and $k = \omega(G)$.

Moreover, inequality (3) is not stronger than inequalities (1) and (2): in fact,
it is easy to show that for every graph $G$ with $n$ vertices and $2\omega(G) > n + 2$, the
right hand side of inequality (2) is smaller than the right hand side of inequality
(3).

The two upper bounds given in our conjectures can be considered as
interpolations between the respective ratios for the empty graphs $O_n$ (graphs with $n$
vertices and no edges) and the complete graphs $K_n$ (graphs with $n$ vertices and
all edges), for which the conjectured bounds are clearly tight. Their strength
allowed us to define operations on graphs that maintain the validity of the
conjectured bounds. In particular, we prove the validity of our conjectures for several
classes of graphs and then use these classes of graphs as building blocks to enlarge
the class of graphs for which our conjectures are true.
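The chain of bounds discussed above (the ratio, then the bound of Conjecture 1, then that of Conjecture 2, then the Shameful bound) can be inspected on a small graph. The Python sketch below is a brute-force illustration only; the function names are hypothetical, vertices are assumed to be numbered 0 to n-1, and exact rational arithmetic is used so that the comparisons are not affected by rounding.

```python
from fractions import Fraction
from itertools import combinations, product


def colourings(n, edges, lam):
    """P(G, lam) by brute force over all assignments of lam colours."""
    return sum(all(c[u] != c[v] for u, v in edges)
               for c in product(range(lam), repeat=n))


def chromatic_number(n, edges):
    """Smallest number of colours admitting a proper colouring."""
    return next(k for k in range(1, n + 1) if colourings(n, edges, k) > 0)


def clique_number(n, edges):
    """Largest k such that some k vertices are pairwise adjacent."""
    e = {frozenset(x) for x in edges}
    return max(k for k in range(1, n + 1)
               if any(all(frozenset(p) in e for p in combinations(s, 2))
                      for s in combinations(range(n), k)))


def bound(n, lam, t):
    """(lam - t)/lam * ((lam - 1)/lam)^(n - t), with t = chi(G) or omega(G)."""
    return Fraction(lam - t, lam) * Fraction(lam - 1, lam) ** (n - t)


def check(n, edges, lam):
    """Return the ratio P(G, lam-1)/P(G, lam) and the two conjectured bounds."""
    ratio = Fraction(colourings(n, edges, lam - 1), colourings(n, edges, lam))
    chi, omega = chromatic_number(n, edges), clique_number(n, edges)
    return ratio, bound(n, lam, chi), bound(n, lam, omega)
```

For the cycle on five vertices with $\lambda = n = 5$ the ratio is 4/17, while the bounds of Conjectures 1 and 2 are 32/125 and 192/625, all below the Shameful bound $(4/5)^5$.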



In [1], the concept of the mean colour number of a graph
$G$ with $n$ vertices was introduced, as

\[ \mu(G) = n \left( 1 - \frac{P(G,n-1)}{P(G,n)} \right). \]

Since the Shameful Conjecture is true, it immediately yields a bound for $\mu(G)$,
that is

\[ \mu(G) \geq n \left( 1 - \left( \frac{n-1}{n} \right)^{n} \right), \]

which is tight only for the graph $G = O_n$. If Conjecture 2 were true, then we
could get a better bound for $\mu(G)$, that is

\[ \mu(G) \geq n \left( 1 - \frac{n-\omega(G)}{n} \left( \frac{n-1}{n} \right)^{n-\omega(G)} \right), \]

which is tight when $G$ is the disjoint union of a clique and a stable set.
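A standard reading of the mean colour number, which this sketch assumes, is the average number of colours actually used over all proper colourings of $G$ with $n$ available colours; by linearity of expectation this average equals $n(1 - P(G,n-1)/P(G,n))$. The Python code below compares the two quantities by brute force on a tiny graph. The function names are hypothetical and vertices are assumed to be numbered 0 to n-1.

```python
from fractions import Fraction
from itertools import product


def proper_colourings(n, edges, lam):
    """List all proper colourings of vertices 0..n-1 with colours 0..lam-1."""
    return [c for c in product(range(lam), repeat=n)
            if all(c[u] != c[v] for u, v in edges)]


def mean_colour_number_formula(n, edges):
    """n * (1 - P(G, n-1) / P(G, n))."""
    p_n = len(proper_colourings(n, edges, n))
    p_n1 = len(proper_colourings(n, edges, n - 1))
    return n * (1 - Fraction(p_n1, p_n))


def mean_colour_number_average(n, edges):
    """Average number of distinct colours used over all proper n-colourings."""
    cols = proper_colourings(n, edges, n)
    return Fraction(sum(len(set(c)) for c in cols), len(cols))
```

For the path on three vertices both functions give 5/2: of the twelve proper colourings with three colours, six use two colours and six use three.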

The next four theorems will give operations for building families of graphs
which satisfy Conjecture 1 (or Conjecture 2) from the basic graph $O_1$.

For this purpose, we first need to give some notations and definitions. Let $G$
be a graph with $n$ vertices. If $G$ is a tree then we shall denote it by $T_n$, and if $G$
is a cycle then we shall denote it by $C_n$.

A universal vertex in a graph is a vertex which is adjacent to all other vertices.
A clique-cutset in a graph $G$ is a clique whose removal from $G$ disconnects the
graph. If $G$ has induced subgraphs $G_1$ and $G_2$ such that $G = G_1 \cup G_2$ and
$G_1 \cap G_2 = K_t$ (for some $t$), then we say that $G$ arises from $G_1$ and $G_2$ by clique
identification (see Figure 1). Clearly, if $G$ arises by clique identification from two
other graphs, then $G$ has a clique-cutset (namely, the clique $K_t$).







Fig. 1. Clique cutset $K_t$



A graph is chordal (or triangulated) if it contains no induced cycles other 
than triangles. It is well known that a graph is chordal if it can be constructed 
recursively by clique identifications, starting from complete graphs. 

Let $uv$ be an edge of a graph $G$. By $G|_{uv}$ we denote the graph obtained from
$G$ by contracting the edge $uv$ into a new vertex which becomes adjacent to all
the former neighbours of $u$ and $v$. We say that $G$ is contractable to a graph $F$
if $G$ contains a subgraph that becomes $F$ after a series of edge contractions and
edge deletions. A graph is series-parallel if it is not contractable to $K_4$.

Finally, let $uv$ be an edge of a graph $G$. Subdividing the edge $uv$ means to
delete $uv$ and add a new vertex $x$ which is adjacent to only $u$ and $v$. It is well
known that a series-parallel multigraph can be constructed recursively from a
$K_2$ by the operations of subdividing and of doubling edges.

Now we are ready to prove our results. 

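Theorem 1 (Disjoint union) Let $H$ be a graph obtained from two graphs $G_1$
and $G_2$ by disjoint union. If both $G_1$ and $G_2$ satisfy Conjecture 1 then $H$ also
satisfies Conjecture 1.

Proof. Assume that $H$ has $n$ vertices. Let $n_i$ denote the number of vertices of
$G_i$ ($i = 1, 2$) and let $\lambda \geq n$ ($= n_1 + n_2$). Assume that $\chi(H) = \chi(G_2) \geq \chi(G_1)$.
Since

\[ P(H,\lambda) = P(G_1,\lambda)\,P(G_2,\lambda), \]

we have

\[ \frac{P(H,\lambda-1)}{P(H,\lambda)} = \frac{P(G_1,\lambda-1)}{P(G_1,\lambda)} \cdot \frac{P(G_2,\lambda-1)}{P(G_2,\lambda)}. \]

Since both $G_1$ and $G_2$ satisfy Conjecture 1, we have

\[ \frac{P(H,\lambda-1)}{P(H,\lambda)} \leq \frac{\lambda-\chi(G_1)}{\lambda} \left( \frac{\lambda-1}{\lambda} \right)^{n_1-\chi(G_1)} \cdot \frac{\lambda-\chi(G_2)}{\lambda} \left( \frac{\lambda-1}{\lambda} \right)^{n_2-\chi(G_2)}. \]

But the easy inequality $\frac{m-k}{m} \leq \left( \frac{m-1}{m} \right)^{k}$ (with $m = \lambda$ and $k = \chi(G_1)$) implies that

\[ \frac{\lambda-\chi(G_1)}{\lambda} \leq \left( \frac{\lambda-1}{\lambda} \right)^{\chi(G_1)}, \]

and so we are done.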

Theorem 2 (Add a universal vertex) Let $H$ be a graph obtained from some
graph $G$ by adding a universal vertex. If $G$ satisfies Conjecture 1 then $H$ also
satisfies Conjecture 1.

Proof. Assume that $H$ has $n$ vertices. Write $\chi = \chi(H) = \chi(G) + 1$ and let
$\lambda \geq n$. Since $P(H,\lambda) = \lambda P(G,\lambda-1)$, we have:

\[ \frac{P(H,\lambda-1)}{P(H,\lambda)} = \frac{\lambda-1}{\lambda} \cdot \frac{P(G,\lambda-2)}{P(G,\lambda-1)}. \]

But then, since $G$ satisfies Conjecture 1,

\[ \frac{P(H,\lambda-1)}{P(H,\lambda)} \leq \frac{\lambda-1}{\lambda} \cdot \frac{\lambda-\chi}{\lambda-1} \left( \frac{\lambda-2}{\lambda-1} \right)^{n-\chi}, \]

and so we are done because

\[ \frac{\lambda-2}{\lambda-1} \leq \frac{\lambda-1}{\lambda}. \]

Theorem 3 (Clique identification) Let $H$ be a graph obtained from two graphs
$G_1$ and $G_2$ by clique identification. If both $G_1$ and $G_2$ satisfy Conjecture 1 then
$H$ also satisfies Conjecture 1.



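Proof. Set $\chi = \chi(H)$. Without loss of generality, we can assume that $\chi(G_2) \geq \chi(G_1)$,
and so $\chi = \chi(G_2)$. Let $n_i$ denote the number of vertices of $G_i$ ($i = 1, 2$)
and let $G_1 \cap G_2 = K_t$. Clearly, $H$ has $n = n_1 + n_2 - t$ vertices. Now, let $\lambda \geq n$.
Since

\[ P(H,\lambda) = \frac{P(G_1,\lambda)\,P(G_2,\lambda)}{P(K_t,\lambda)} = \frac{P(G_1,\lambda)\,P(G_2,\lambda)}{(\lambda)_t}, \]

we have

\[ \frac{P(H,\lambda-1)}{P(H,\lambda)} = \frac{\lambda}{\lambda-t} \cdot \frac{P(G_1,\lambda-1)}{P(G_1,\lambda)} \cdot \frac{P(G_2,\lambda-1)}{P(G_2,\lambda)}. \]

Since both $G_1$ and $G_2$ satisfy (1), we have

\[ \frac{P(H,\lambda-1)}{P(H,\lambda)} \leq \frac{\lambda}{\lambda-t} \cdot \frac{\lambda-\chi(G_1)}{\lambda} \left( \frac{\lambda-1}{\lambda} \right)^{n_1-\chi(G_1)} \cdot \frac{\lambda-\chi(G_2)}{\lambda} \left( \frac{\lambda-1}{\lambda} \right)^{n_2-\chi(G_2)}, \]

that is

\[ \frac{P(H,\lambda-1)}{P(H,\lambda)} \leq \frac{\lambda-\chi}{\lambda} \left( \frac{\lambda-1}{\lambda} \right)^{n-\chi} \cdot \frac{\lambda-\chi(G_1)}{\lambda-t} \left( \frac{\lambda-1}{\lambda} \right)^{t-\chi(G_1)}. \]

Hence, to prove the theorem, we only need show that

\[ \frac{\lambda-\chi(G_1)}{\lambda-t} \left( \frac{\lambda-1}{\lambda} \right)^{t-\chi(G_1)} \leq 1, \]

that is

\[ \frac{\lambda-\chi(G_1)}{\lambda-t} \leq \left( \frac{\lambda-1}{\lambda} \right)^{\chi(G_1)-t}. \]

Now, since $\chi(G_1) \geq t$, the easy inequality $\frac{m-k}{m} \leq \left( \frac{m-1}{m} \right)^{k}$ (with $m = \lambda$ and $k = \chi(G_1) - t$) implies that

\[ \left( \frac{\lambda-1}{\lambda} \right)^{\chi(G_1)-t} \geq \frac{\lambda-\chi(G_1)+t}{\lambda}. \]

But

\[ \frac{\lambda-\chi(G_1)+t}{\lambda} \geq \frac{\lambda-\chi(G_1)}{\lambda-t}, \]

and so we are done.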



Theorem 4 (Edge subdivision) Let $G$ be a graph with $n$ vertices, let $uv$ be an
edge of $G$, let $r$ be a positive integer, and let $H$ be the graph obtained from $G$ by
deleting edge $uv$ and by adding the new vertices $x_1, \ldots, x_r$ and connecting each
of them to both $u$ and $v$ (see Figure 2). If the following two properties hold

(a) min{dG{u),dG{v)} < , 

(b) both $G$ and $G|_{uv}$ satisfy Conjecture 2,

then the graph $H$ also satisfies Conjecture 2.

Fig. 2. Subdivide edge $uv$



Before proving Theorem 4, we need the following technical lemma.

Lemma 1 Let $x$ and $r$ be two integers with $x > r \geq 1$. Then

\[ \frac{(x-1)^r (x+1)^{r+1} - x^{2r+1}}{x \left[ (x-1)^r x^r - (x+1)^r (x-2)^r \right]} \leq \frac{x}{2}. \]

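Proof of Theorem 4.

Write $G'' = G|_{uv}$, $\omega = \omega(H)$, $\omega' = \omega(G)$, and $\omega'' = \omega(G'')$. Since

\[ P(H,\lambda) = P(H+uv,\lambda) + P(H|_{uv},\lambda) = (\lambda-2)^r P(G,\lambda) + (\lambda-1)^r P(G'',\lambda), \]

we have

\[ \frac{P(H,\lambda-1)}{P(H,\lambda)} = \left( \frac{\lambda-3}{\lambda-2} \right)^{r} \frac{P(G,\lambda-1)}{P(G,\lambda)}\,\alpha + \left( \frac{\lambda-2}{\lambda-1} \right)^{r} \frac{P(G'',\lambda-1)}{P(G'',\lambda)}\,(1-\alpha), \]

where

\[ \alpha = \frac{(\lambda-2)^r P(G,\lambda)}{(\lambda-2)^r P(G,\lambda) + (\lambda-1)^r P(G'',\lambda)}. \]

To prove the theorem we have to show that, for every $\lambda \geq n + r$,

\[ \frac{P(H,\lambda-1)}{P(H,\lambda)} \leq \frac{\lambda-\omega}{\lambda} \left( \frac{\lambda-1}{\lambda} \right)^{n+r-\omega}. \]

For this purpose, write

\[ R = \left( \frac{\lambda-3}{\lambda-2} \right)^{r}, \qquad S = \left( \frac{\lambda-2}{\lambda-1} \right)^{r}. \]

Since by assumption both $G$ and $G''$ satisfy Conjecture 2, we have

\[ \frac{P(H,\lambda-1)}{P(H,\lambda)} \leq R\alpha\,\frac{\lambda-\omega'}{\lambda} \left( \frac{\lambda-1}{\lambda} \right)^{n-\omega'} + S(1-\alpha)\,\frac{\lambda-\omega''}{\lambda} \left( \frac{\lambda-1}{\lambda} \right)^{n-1-\omega''}. \]

Hence, to prove the theorem we only need show that

\[ R\alpha\,\frac{\lambda-\omega'}{\lambda-\omega} \left( \frac{\lambda}{\lambda-1} \right)^{s'} + S(1-\alpha)\,\frac{\lambda-\omega''}{\lambda-\omega} \left( \frac{\lambda}{\lambda-1} \right)^{s''} \leq 1, \tag{4} \]

where $s' = r + \omega' - \omega$ and $s'' = r + \omega'' - \omega + 1$.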

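For this purpose, first note that either $\omega' = \omega$ or $\omega' = \omega + 1$ and that either
$\omega'' = \omega$ or $\omega'' = \omega + 1$. But, since

\[ \frac{\lambda-\omega^*}{\lambda-\omega} \left( \frac{\lambda}{\lambda-1} \right)^{\omega^*-\omega} \leq 1, \]

where $\omega^* = \omega'$ or $\omega^* = \omega''$, it follows that we only need show the validity of (4)
in the case $\omega' = \omega'' = \omega$, that is

\[ \left( \frac{\lambda-3}{\lambda-2} \right)^{r} \left( \frac{\lambda}{\lambda-1} \right)^{r} \alpha + \left( \frac{\lambda-2}{\lambda-1} \right)^{r} \left( \frac{\lambda}{\lambda-1} \right)^{r+1} (1-\alpha) \leq 1. \tag{5} \]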



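Now, inequality (5) is equivalent to the following

\[ \left[ \left( \frac{\lambda-3}{\lambda-2} \right)^{r} - \left( \frac{\lambda-2}{\lambda-1} \right)^{r} \frac{\lambda}{\lambda-1} \right] \alpha \leq \left( \frac{\lambda-1}{\lambda} \right)^{r} - \left( \frac{\lambda-2}{\lambda-1} \right)^{r} \frac{\lambda}{\lambda-1}. \]

Since the coefficient of $\alpha$ in this inequality is strictly negative, we can divide
both sides by this term and simplify terms to get the equivalent inequality:

\[ \alpha \geq \left( \frac{\lambda-2}{\lambda} \right)^{r} \frac{(\lambda-2)^r \lambda^{r+1} - (\lambda-1)^{2r+1}}{(\lambda-2)^{2r} \lambda - (\lambda-3)^r (\lambda-1)^{r+1}}. \]

Replacing the expression for $\alpha$, we have

\[ \frac{P(G,\lambda)}{P(G'',\lambda)} \geq \frac{(\lambda-2)^r \lambda^{r+1} - (\lambda-1)^{2r+1}}{(\lambda-1) \left[ (\lambda-2)^r (\lambda-1)^r - \lambda^r (\lambda-3)^r \right]}. \tag{6} \]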

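Hence, in order to prove the theorem, we only need show that (6) holds. Now,
Lemma 1 (applied with $x = \lambda - 1$) implies that

\[ \frac{(\lambda-2)^r \lambda^{r+1} - (\lambda-1)^{2r+1}}{(\lambda-1) \left[ (\lambda-2)^r (\lambda-1)^r - \lambda^r (\lambda-3)^r \right]} \leq \frac{\lambda-1}{2}. \]

Hence it is sufficient to show that for every $\lambda \geq n + r$,

\[ \frac{P(G,\lambda)}{P(G'',\lambda)} \geq \frac{\lambda-1}{2}. \]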

For this purpose, consider any $\lambda$-colouring of the graph $G''$. Since $G''$ has less
than $\lambda$ vertices, this colouring can be extended to a $\lambda$-colouring of the graph $G$
as follows: give to vertex $u$ (respectively, $v$) the same colour as that given to the
vertex in $G''$ arising from contracting $uv$, and give to vertex $v$ (respectively, $u$)
any of the at least $\lambda - d_G(v)$ (respectively, $\lambda - d_G(u)$) colours not used by the
neighbours of vertex $v$ (respectively, $u$). In other words,

\[ P(G,\lambda) \geq P(G'',\lambda)\,(\lambda - \min\{d_G(u), d_G(v)\}). \]

Now, by assumption (a) and since $\lambda \geq n + r$,

\[ \lambda - \min\{d_G(u), d_G(v)\} \geq \frac{\lambda-1}{2}, \]

and we are done. The theorem follows.

The previous four theorems give operations for building families of graphs 
which satisfy Conjecture 1 (or Conjecture 2) from the basic graph $O_1$.

The following corollary follows immediately from Theorems 2 and 3: 

Corollary 1 Every chordal graph satisfies Conjecture 1. 
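As a sanity check of Corollary 1 on the simplest connected chordal graphs, the bound of Conjecture 1 can be verified by hand for a tree. The derivation below uses only the well-known formula $P(T_n,\lambda) = \lambda(\lambda-1)^{n-1}$ for a tree on $n \geq 2$ vertices and $\chi(T_n) = 2$; it is a worked example, not part of the original argument.

```latex
\[
\frac{P(T_n,\lambda-1)}{P(T_n,\lambda)}
  = \frac{(\lambda-1)(\lambda-2)^{n-1}}{\lambda(\lambda-1)^{n-1}}
  = \frac{\lambda-2}{\lambda}\left(\frac{\lambda-2}{\lambda-1}\right)^{n-2}
  \leq \frac{\lambda-2}{\lambda}\left(\frac{\lambda-1}{\lambda}\right)^{n-2},
\]
% the last step holds because
\[
\frac{\lambda-2}{\lambda-1} \leq \frac{\lambda-1}{\lambda}
\iff \lambda(\lambda-2) \leq (\lambda-1)^2 .
\]
```

The right-hand side is exactly the bound of Conjecture 1 with $\chi = 2$, and the middle expression shows that trees attain Dong's bound for connected graphs with equality.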

In particular, the empty graphs $O_n$ and the trees $T_n$ satisfy Conjecture 1.
Moreover, Theorems 3 and 4 can be used to prove the following result: 

Theorem 5 Every series-parallel graph satisfies Conjecture 2. 

Proof. Let $H$ be a series-parallel graph with $m$ vertices. If $m$ is small then the
theorem is obviously true. Hence, we can assume that every series-parallel graph
with fewer vertices than $H$ verifies Conjecture 2. Moreover, we can assume that
$H$ has no clique-cutset: for otherwise $H$ would arise as clique-identification from
two other series-parallel graphs and so we could apply Theorem 3.

Now, by definition, $H$ comes from some other series-parallel graph $H'$ by
either duplicating some edge of $H'$ or by subdividing some edge of $H'$. Since in
the first case $H$ will still verify Conjecture 2, we only need show the validity of
the theorem when $H$ is constructed from $H'$ by subdividing some edge $uv$ of $H'$.
Let $x$ be the unique vertex of $H$ that is not a vertex of $H'$. Set

\[ T = \{ y \in V(H') : d_{H'}(y) = 2,\ yu \in E(H'),\ yv \in E(H') \}. \]

Write $T = \{x_1, \ldots, x_{r-1}\}$, with $r \geq 1$. Let $G$ denote the graph obtained from
$H'$ by removing all vertices in $T$. It follows that $H$ can be built from $G$ by
subdividing edge $uv$ with the $r$ vertices $x_1, \ldots, x_{r-1}, x_r$ with $x_r = x$, as shown
in Figure 2. Clearly, $G$ is also series-parallel, and so it verifies Conjecture 2. Let
$n$ denote the number of vertices of $G$. Note that $H$ has $n + r$ vertices. Since the
graph $G|_{uv}$ is also series-parallel, we can apply Theorem 4. Hence, to prove the
theorem, we only need show that $G$, $u$ and $v$ satisfy property (a) of Theorem 4.

For this purpose, set

\[ A = \{ y \in V(G) : yu \notin E(G),\ yv \notin E(G) \}, \]
\[ B = \{ y \in V(G) : yu \in E(G),\ yv \in E(G) \}, \]
\[ C = V(G) - (A \cup B \cup \{u, v\}). \]

Now, if $B$ contains at most one vertex, then $d_G(u) + d_G(v) \leq n + 1$ and we are
done. Hence we can assume that $B$ contains at least two vertices. Clearly, $B$ is
a stable set in $G$ (for otherwise, $G$ would contain a $K_4$).

First, note that:

• no vertex in $C$ is adjacent to some vertex in $B$.

To see this, assume the contrary: there exists some vertex $z \in C$ which is adjacent
to some vertex $y \in B$. Without loss of generality, we can assume that $zu \in E(G)$.
Since $H$ has no clique-cutset, it follows that $\{u, y\}$ is not a clique-cutset in $G$, and
so there must exist a path $P$ in $G - \{u, y\}$ joining $z$ to $v$. But then contracting
all edges of $P - \{v\}$, we get a $K_4$, contradicting the assumption that $G$ is
series-parallel.

Next, note that

• every vertex in $A$ is adjacent to at most one vertex in $B$.

This is obviously true because $G$ is not contractable to $K_4$.

Since by assumption $H$ and hence $G$ has no clique cutset, every vertex in $B$ is
adjacent to some vertex in $A \cup C$. It follows that $|A| \geq |B|$ (recall that no vertex
in $B$ is adjacent to some vertex in $C$), and so $n = 2 + |A| + |B| + |C| \geq 2 + 2|B| + |C|$.
But then $d_G(u) + d_G(v) = |C| + 2|B| + 2 \leq n$, and so $\min\{d_G(u), d_G(v)\} \leq n/2$,
and we are done.

Abstract. A skew partition as defined by Chvatal is a partition of the
vertex set of a graph into four nonempty parts A, B, C, D such that
there are all possible edges between A and B, and no edges between C
and D. We present a polynomial-time algorithm for testing whether a
graph admits a skew partition. Our algorithm solves the more general
list skew partition problem, where the input contains, for each vertex,
a list containing some of the labels A, B, C, D of the four parts. Our
polynomial-time algorithm settles the complexity of the original partition
problem proposed by Chvatal, and answers a recent question of Feder,
Hell, Klein and Motwani.



1 Introduction 

A skew partition is a partition of the vertex set of a graph into four nonempty 
parts A,B,C,D such that there are all possible edges between A and B, and 
no edges between C and D. We present a polynomial-time algorithm for testing 
whether a graph admits a skew partition, as well as for the more general list skew 
partition problem, where the input contains, for each vertex, a list containing 
some of the four parts. 
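The definition of a skew partition translates directly into a checker, and for very small graphs into an exhaustive search. The Python sketch below is only an illustration of the definition, not the polynomial-time algorithm of this paper: it assumes the graph is a dictionary `adj` of neighbour sets, the function names are hypothetical, and the search tries all $4^n$ labellings.

```python
from itertools import product


def is_skew_partition(adj, A, B, C, D):
    """Check the definition: four nonempty parts, A-B complete, C-D empty."""
    parts = [set(A), set(B), set(C), set(D)]
    if any(not p for p in parts):
        return False  # all four parts must be nonempty
    if sum(len(p) for p in parts) != len(adj) or set().union(*parts) != set(adj):
        return False  # the parts must partition the vertex set
    complete = all(b in adj[a] for a in parts[0] for b in parts[1])
    empty = all(d not in adj[c] for c in parts[2] for d in parts[3])
    return complete and empty


def find_skew_partition(adj):
    """Exhaustive search over all labellings; exponential, for tiny graphs only."""
    vertices = sorted(adj)
    for labels in product("ABCD", repeat=len(vertices)):
        parts = {x: {v for v, lab in zip(vertices, labels) if lab == x}
                 for x in "ABCD"}
        if is_skew_partition(adj, *(parts[x] for x in "ABCD")):
            return parts
    return None
```

On the path with four vertices the two middle vertices can serve as A and B and the two end vertices as C and D, whereas the cycle on four vertices admits no skew partition.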

Many combinatorial problems can be described as finding a partition of the 
vertices of a given graph into subsets satisfying certain properties internally 
(some parts may be required to be independent, or sparse in some other sense, 
others may conversely be required to be complete or dense), and externally (some 
pairs of parts may be required to be completely nonadjacent, others completely 
adjacent). In [10], Feder et al. defined a parameterized family of graph problems 
of this type. 

* Research partially supported by CNPq, MCT/FINEP PRONEX Project 107/97,
CAPES (Brazil) /COFECUB (France) Project 213/97, FAPERJ, and by FAPESP
Proc. 96/04505-2.

The basic family of problems they considered is as follows: partition the
vertex set of a graph into $k$ parts $A_1, A_2, \ldots, A_k$ with a fixed “pattern” of
requirements as to which $A_i$ are independent or complete and which pairs $A_i, A_j$
are completely nonadjacent or completely adjacent. These requirements may be
conveniently encoded by a symmetric $k \times k$ matrix $M$ in which the diagonal entry
$M_{i,i}$ is 0 if $A_i$ is required to be independent, 2 if $A_i$ is required to be a clique,
and 1 otherwise (no restriction). Similarly, the off-diagonal entry $M_{i,j}$ is 0, 1, or
2, if $A_i$ and $A_j$ are required to be completely nonadjacent, have arbitrary
connections, or are required to be completely adjacent, respectively. Following [10],
we call such a partition an M-partition.
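The matrix encoding can be made concrete with a small checker. The Python sketch below is a minimal illustration under stated assumptions: the graph is a dictionary `adj` of neighbour sets, `parts` is a list of vertex sets in the order of the rows of `M`, and the names `is_m_partition` and `SKEW` are hypothetical. The matrix `SKEW` uses the rows 1211, 2111, 1110 and 1101 that the paper gives for skew partitions. The checker tests only the adjacency pattern; it does not require the parts to be nonempty or to cover the graph.

```python
# Rows 1211, 2111, 1110, 1101: A-B complete (2), C-D empty (0), rest unrestricted (1).
SKEW = [[1, 2, 1, 1],
        [2, 1, 1, 1],
        [1, 1, 1, 0],
        [1, 1, 0, 1]]


def is_m_partition(adj, parts, M):
    """Check that the parts respect the 0/1/2 pattern encoded by the matrix M."""
    k = len(M)
    for i in range(k):
        for j in range(i, k):
            if M[i][j] == 1:
                continue  # no restriction between parts i and j
            required = (M[i][j] == 2)  # 2: all edges present, 0: no edges
            for u in parts[i]:
                for v in parts[j]:
                    if u != v and (v in adj[u]) != required:
                        return False
    return True
```

With the diagonal entries, the same loop also enforces that a part is independent (entry 0) or a clique (entry 2), since the case `i == j` compares distinct vertices of one part.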

Many combinatorial problems just ask for an M-partition. For instance a
k-coloring is an M-partition where M is the adjacency matrix of the complete
k-graph, and, more generally, H-coloring (homomorphism to a fixed graph H [13])
is an M-partition where M is the adjacency matrix of H. It is known that
H-coloring is polynomial-time solvable when H is bipartite and NP-complete
otherwise [13]. When M is the adjacency matrix of H plus twice the identity
matrix (all diagonal elements are 2), then M-partitions reduce to the so-called
(H, K)-partitions which were studied by MacGillivray and Yu [15]. When H is
triangle-free then (H, K)-partition is polynomial-time solvable, otherwise it is
NP-complete.

Other well-known problems ask for M-partitions in which all parts are
restricted to be nonempty (e.g., skew partitions, clique cutsets, stable cutsets). In
yet other problems there are additional constraints, such as those in the
definition of a homogeneous set (requiring one of the parts to have at least 2 and at
most $n - 1$ vertices). For instance, Winkler asked for the complexity of deciding
the existence of an M-partition, where M has the rows 1101, 1110, 0111, and
1011, such that all parts are nonempty and there is at least one edge between
parts A and B, B and C, C and D, and D and A. This has recently been shown
NP-complete by Vikas [17].

The most convenient way to express these additional constraints turns out 
to be to allow specifying for each vertex (as part of the input) a “list” of parts in 
which the vertex is allowed to be. Specifically, the list- M -partition problem asks 
for an M-partition of the input graph in which each vertex is placed in a part 
which is in its list. Both the basic M-partition problem (“Does the input graph 
admit an M-partition?”), and the problem of existence of an M-partition with 
all parts nonempty, admit polynomial-time reductions to the list-M-partition 
problem, as do all of the above problems with the “additional” constraints. List 
partitions generalize list-colorings, which have proved very fruitful in the study 
of graph colorings [1,12]. They also generalize list-homomorphisms which were 
studied earlier [7,8,9]. 

Feder et al. [10] were the first to introduce and investigate the list version of
these problems. It turned out to be a useful generalization, since list problems
recurse more conveniently. This enabled them to classify the complexity (as
polynomial-time solvable or NP-complete) of list-M-partition problems for all
$3 \times 3$ matrices M and some $4 \times 4$ matrices M. For other $4 \times 4$ matrices M they
were able to produce sub-exponential algorithms, including one for the skew
partition problem described below. This was the first sub-exponential algorithm
for the problem, and an indication that the problem is not likely to be
NP-complete. We were motivated by their approach, and show that in fact one can
use the mechanism of list partitions to give a polynomial-time algorithm for the
problem.

A skew partition is an M-partition, where M has the rows 1211, 2111, 1110,
and 1101, such that all parts are nonempty. List Skew Partition (LSP) is simply
the list-M-partition problem for this M. We can solve skew partition by solving
at most $n^4$ LSP problems ($n$ being the number of vertices), one for each
quadruple $\{v_1, v_2, v_3, v_4\}$ of vertices of the input graph, in which $v_1$, $v_2$, $v_3$ and
$v_4$ are required to lie in A, B, C and D, respectively.

The skew partition problem has interesting links to perfect graphs, and is 
one of the main problems in the area. Before presenting our two algorithms we 
discuss perfect graphs and their link to skew partition. 



2 Skew Cutsets and the Strong Perfect Graph Conjecture 

A graph is perfect if each induced subgraph admits a vertex colouring and a 
clique of the same size. A graph is minimal imperfect if it is not perfect but all 
of its proper induced subgraphs are perfect. Perfect graphs were first defined by 
Berge [2] who was interested in finding a good characterization of such graphs. 
He proposed the strong perfect graph conjecture: the only minimal imperfect 
graphs are the odd chordless cycles of length at least five and their complements. 
Since then researchers have enumerated a list of properties of minimal imperfect 
graphs. The strong perfect graph conjecture remains open and is considered a 
central problem in computational complexity, combinatorial optimization, and 
graph theory. 

Chvatal [4] proved that no minimal imperfect graph contains a structure 
that he called a star cutset: a vertex cutset consisting of a vertex and some of 
its neighbours. Chvatal exhibited a polynomial-time recognition algorithm for 
graphs with a star cutset. He also conjectured that no minimal imperfect graph 
contains a skew partition. Recalling our earlier definition, a skew partition is a 
partition of the vertex set of a graph into four nonempty parts A, B, C, D such 
that there are all possible edges between A and B, and no edges between C and 
D. We call each of the four nonempty parts A, B, C, D a skew partition set. We
say that $A \cup B$ is a skew cutset. The complexity of testing for the existence of
a skew cutset has motivated many publications [5,10,14,16]. Recently, Feder 
et al. [10] described a quasi-polynomial algorithm for testing whether a graph 
admits a skew partition, which strongly suggested that this problem was not 
NP-complete. In this paper, we present a polynomial-time recognition algorithm 
for testing whether a graph admits a skew partition. 

Cornuejols and Reed [5] proved that no minimal imperfect graph contains a
skew partition in which A and B are both stable sets. Actually, they proved the
following more general result. Let a complete multi-partite graph be one whose
vertex set can be partitioned into stable sets $S_1, \ldots, S_k$, such that there are all
possible edges between $S_i$ and $S_j$, for $i \neq j$. They proved that no minimal
imperfect graph contains a skew cutset that induces a complete multi-partite graph.
Their work raised questions about the complexity of testing for the existence
either of a complete bipartite cutset or of a complete multi-partite cutset in a
graph.

Subsequently, Klein and de Figueiredo [16] showed how to use a result of
Chvatal [3] on matching cutsets in order to establish the NP-completeness of
recognizing graphs with a stable cutset. In addition, they established the
NP-completeness of recognizing graphs with a complete multi-partite cutset. In
particular, their proof showed that it is NP-complete to test for the existence of a
complete bipartite cutset, even if the cutset induces a $K_{1,p}$.

As shown by Chvatal [4], to test for the existence of a star cutset is in P,
whereas to test for the existence of the special star cutset $K_{1,p}$ is NP-complete,
as shown in [16]. The polynomial-time algorithm described in this paper offers
an analogous complexity situation: to test for the existence of a skew cutset
is in P, whereas to test for the existence of a complete bipartite cutset is
NP-complete [16].

3 Overview 

The goal of this paper is to present a polynomial-time algorithm for the following 
decision problem: 

Skew Partition Problem 
Input: a graph $G = (V, E)$.

Question: Is there a skew partition A, B, C, D of G? 

We actually consider list skew partition (LSP) problems, stated as decision 
problems as follows: 

List Skew Partition Problem 

Input: a graph $G = (V, E)$ and for each vertex $v \in V$, a subset $L_v$ of $\{A, B, C, D\}$.
Question: Is there a skew partition A, B, C, D of G such that each $v$ is contained
in some element of the corresponding $L_v$?

Throughout the algorithm we have a partition of $V$ into at most 15 sets
indexed by the nonempty subsets of $\{A, B, C, D\}$, i.e., $\{S_L \mid L \subseteq \{A, B, C, D\}\}$,
such that Property 1 is always satisfied. For convenience, we denote $S_{\{A\}}$ by $S_A$.
Note that the relevant inputs for LSP have $S_A$, $S_B$, $S_C$, and $S_D$ nonempty.

Property 1. If the algorithm returns a skew partition, then if $v$ is in $S_L$, then
the returned skew partition set containing $v$ is in $L$.

Initially, we set $S_L = \{v \mid L_v = L\}$.

We also restrict our attention to LSP instances which satisfy the following 
property: 

Property 2. If $v \in S_L$, for some $L$ with $A \in L$, then $v$ sees every vertex of $S_B$.
If $v \in S_L$, for some $L$ with $B \in L$, then $v$ sees every vertex of $S_A$. If $v \in S_L$, for
some $L$ with $C \in L$, then $v$ is non-adjacent to every vertex of $S_D$. If $v \in S_L$, for
some $L$ with $D \in L$, then $v$ is non-adjacent to every vertex of $S_C$.

Both Properties 1 and 2 hold throughout the algorithm. 

Remark 1. Since $S_B$ must be contained in $B$, we know that if $v$ is to be in $A$ for
some solution to the problem, then $v$ must see all of $S_B$. Thus if some $v \in S_A$
misses a vertex of $S_B$, then there is no solution to the problem and we need not
continue. If there is some $L$ with $\{A\}$ properly contained in $L$ and a vertex $v$ in
$S_L$ which misses a vertex of $S_B$, then we know in any solution to the problem $v$
must be contained in some element of $L \setminus \{A\}$. So we can reduce to a new problem
where we replace $S_L$ by $S_L \setminus v$, we replace $S_{L \setminus A}$ by $S_{L \setminus A} + v$ and all other $S_L$ are
as before. Such a reduction reduces $\sum_L |S_L|\,|L|$ by 1. Since this sum is at most
$4n$, after $O(n)$ similar reductions we must obtain an LSP problem satisfying
Property 2 (or halt because the original problem has no solution).
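The reduction of Remark 1, applied symmetrically to the four labels, can be sketched as a fixed-point loop. This Python sketch is a hedged illustration rather than the paper's procedure: it assumes the instance is stored as a dictionary `lists` from vertices to label sets (so that $S_L$ is the set of vertices whose list equals $L$), the names `PARTNER` and `enforce_property_2` are hypothetical, and no attempt is made to match the running time discussed in the remark.

```python
# For each label: the partner singleton class and whether adjacency is required.
PARTNER = {"A": ("B", True), "B": ("A", True), "C": ("D", False), "D": ("C", False)}


def enforce_property_2(adj, lists):
    """Shrink the lists until Property 2 holds; return None if some list empties."""
    lists = {v: set(L) for v, L in lists.items()}
    changed = True
    while changed:
        changed = False
        # forced[X] plays the role of S_X: vertices whose list is exactly {X}.
        forced = {X: {v for v, L in lists.items() if L == {X}} for X in "ABCD"}
        for v, L in lists.items():
            for X in sorted(L):
                other, must_see = PARTNER[X]
                if any((w in adj[v]) != must_see for w in forced[other]):
                    L.discard(X)  # v cannot be placed in part X
                    changed = True
                    if not L:
                        return None  # the original problem has no solution
    return lists
```

Each removal lowers the sum of the list sizes by one, which mirrors the counting argument of the remark; the loop stops after a pass in which no label is removed, at which point Property 2 holds for the current lists.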

In our discussion we often create new LSP instances and whenever we do 
so, we always perform this procedure to reduce to an LSP problem satisfying 
Property 2. 

For an instance $I$ of LSP we have $\{S_L(I) \mid L \subseteq \{A, B, C, D\}\}$, but we drop the
$(I)$ when it is not needed for clarity.

We will consider a number of restricted versions of the LSP problems:

- MAX-3-LSP: an LSP problem satisfying Property 2 such that $S_{ABCD} = \emptyset$;

- MAX-2-LSP: an LSP problem satisfying Property 2 such that if $|L| > 2$,
then $S_L = \emptyset$;

- AC-TRIV LSP: an LSP problem satisfying Property 2 such that $S_{AC} = \emptyset$;

Remark 2. It is easy to obtain a solution to an instance of AC-TRIV LSP
as follows: $A = S_A$, $B = \bigcup_{B \in L} S_L$, $C = S_C$, and $D = \bigcup_{D \in L,\, B \notin L} S_L$. By
Property 2 this is indeed a skew partition.

- BD-TRIV LSP, AD-TRIV LSP, BC-TRIV LSP. These problems are defined
and solved similarly as AC-TRIV LSP.
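The construction of Remark 2 is short enough to write out. The Python sketch below assumes, as an illustration only, that an instance is a dictionary `lists` from vertices to label sets and that it already satisfies Property 2 with $S_A$, $S_B$, $S_C$, $S_D$ nonempty; the function name `solve_ac_triv` is hypothetical.

```python
def solve_ac_triv(lists):
    """Build the partition of Remark 2 for an instance with S_AC empty."""
    parts = {"A": set(), "B": set(), "C": set(), "D": set()}
    for v, labels in lists.items():
        L = frozenset(labels)
        if L == {"A", "C"}:
            raise ValueError("not an AC-TRIV instance: S_AC is nonempty")
        if L == {"A"}:
            parts["A"].add(v)   # A = S_A
        elif L == {"C"}:
            parts["C"].add(v)   # C = S_C
        elif "B" in L:
            parts["B"].add(v)   # B = union of S_L over lists containing B
        else:
            parts["D"].add(v)   # D = union of S_L over lists with D and without B
    return parts
```

Every nonempty list other than {A}, {C} and {A, C} contains B or D, so each vertex lands in exactly one part; Property 2 then guarantees that all edges between A and B are present and that no edge joins C and D.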

Our algorithm for solving LSP requires four subalgorithms which replace an 
instance of LSP by a polynomial number of instances of more restricted versions 
of LSP. 

Algorithm 1 Takes an instance of LSP and returns in polynomial time a list
$\mathcal{L}$ of a polynomial number of instances of MAX-3-LSP such that

(i) a solution to any problem in $\mathcal{L}$ is a solution of the original problem, and

(ii) if none of the problems in $\mathcal{L}$ have a solution, then the original problem has
no solution.

Algorithm 2 Takes an instance of MAX-3-LSP and returns in polynomial time
a list $\mathcal{L}$ of a polynomial number of instances of MAX-3-LSP such that:

(i) and (ii) of Algorithm 1 hold, and

(iii) for each problem in $\mathcal{L}$, either $S_{ABC} = \emptyset$ or $S_{ABD} = \emptyset$.

Algorithm 3 Takes an instance of MAX-3-LSP and returns in polynomial time
a list $\mathcal{L}$ of a polynomial number of instances of MAX-3-LSP such that:

(i) and (ii) of Algorithm 1 hold, and

(iii) for each problem in $\mathcal{L}$, either $S_{BCD} = \emptyset$ or $S_{ACD} = \emptyset$.

Algorithm 4 Takes an instance of MAX-3-LSP such that

(a) either $S_{ABC}$ or $S_{ABD}$ is empty, and

(b) either $S_{BCD}$ or $S_{ACD}$ is empty

and returns a list $\mathcal{L}$ of a polynomial number of problems each of which is an
instance of one of MAX-2-LSP, AC-TRIV LSP, AD-TRIV LSP, BC-TRIV LSP
or BD-TRIV LSP such that (i) and (ii) of Algorithm 1 hold.

We also need two more algorithms for dealing with the most basic instances
of LSP.

Algorithm 5 Takes an instance of MAX-2-LSP and returns either

(i) a solution to this instance of MAX-2-LSP, or

(ii) the information that this problem instance has no solution.

Remark 3. Algorithm 5 simply applies 2-SAT as discussed in [6]; we omit the 
details. 



Algorithm 6 Takes an instance of AC-TRIV-LSP, AD-TRIV-LSP, BC-TRIV-LSP,
or BD-TRIV-LSP and returns a solution using the partitions discussed
in Remark 2.

To solve an instance of LSP we first apply Algorithm 1 to obtain a list
$\mathcal{L}_1$ of instances of MAX-3-LSP. For each problem instance $I$ on $\mathcal{L}_1$, we apply
Algorithm 2 and let $\mathcal{L}_I$ be the output list of problem $I$. We let $\mathcal{L}_2$ be the
concatenation of the lists $\{\mathcal{L}_I \mid I \in \mathcal{L}_1\}$. For each $I$ in $\mathcal{L}_2$, we apply Algorithm
3. Let $\mathcal{L}_3$ be the concatenation of the lists $\{\mathcal{L}_I \mid I \in \mathcal{L}_2\}$. For each problem
instance $I$ on $\mathcal{L}_3$, we apply Algorithm 4. Let $\mathcal{L}_4$ be the concatenation of the
lists $\{\mathcal{L}_I \mid I \in \mathcal{L}_3\}$. Each element of $\mathcal{L}_4$ can be solved using either Algorithm 5 or
Algorithm 6 in polynomial time. If any of these problems has a solution $S$, then
by the specifications of the algorithms, $S$ is a solution to the original problem.
Otherwise, by the specifications of the algorithms, there is no solution to the
original problem. Clearly, the whole algorithm runs in polynomial time.
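The control flow of this reduction can be sketched as follows. This is only an illustration of how the four refinement steps and the two base-case solvers compose; the function names (`solve_lsp`, `flat_map`, `alg1` to `alg4`, `solve_basic`) and the abstract instance type are hypothetical, and the algorithms themselves are passed in as parameters rather than implemented.

```python
from typing import Callable, List, Optional, TypeVar

Instance = TypeVar("Instance")   # an LSP instance; its representation is left abstract
Solution = TypeVar("Solution")   # a skew partition solving an instance


def flat_map(refine: Callable[[Instance], List[Instance]],
             instances: List[Instance]) -> List[Instance]:
    """Apply `refine` to every instance and concatenate the output lists."""
    out: List[Instance] = []
    for inst in instances:
        out.extend(refine(inst))
    return out


def solve_lsp(instance: Instance,
              alg1: Callable[[Instance], List[Instance]],
              alg2: Callable[[Instance], List[Instance]],
              alg3: Callable[[Instance], List[Instance]],
              alg4: Callable[[Instance], List[Instance]],
              solve_basic: Callable[[Instance], Optional[Solution]],
              ) -> Optional[Solution]:
    """Compose Algorithms 1-4, then try each basic instance (Algorithm 5 or 6)."""
    l1 = alg1(instance)        # instances of MAX-3-LSP
    l2 = flat_map(alg2, l1)    # S_ABC or S_ABD empty
    l3 = flat_map(alg3, l2)    # additionally S_BCD or S_ACD empty
    l4 = flat_map(alg4, l3)    # MAX-2-LSP or one of the trivial variants
    for basic in l4:
        solution = solve_basic(basic)
        if solution is not None:
            return solution    # by property (i), also solves the original instance
    return None                # by property (ii), the original instance has no solution
```

Because every list has polynomial length and each step runs in polynomial time, the nested loops above stay polynomial overall.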
4 Algorithm 1

We now present Algorithm 1. The other algorithms are similar and their details 
are left for a longer version of this paper. The full details and proofs are in the 
technical report [11]. 

Algorithm 1 recursively applies Procedure 1, which runs in polynomial time.

Procedure 1
Input: An instance $I$ of LSP.

Output: Four instances $I_1, I_2, I_3, I_4$ of LSP such that, for $1 \le j \le 4$, we have
$|S_{ABCD}(I_j)| \le \frac{9}{10} |S_{ABCD}(I)|$.

It is easy to prove inductively that recursively applying Procedure 1 yields a
polynomial time implementation of Algorithm 1 which, when applied to an input
graph with $n$ vertices, creates as output a list $\mathcal{L}$ of instances of LSP such that

$$|\mathcal{L}| \le 4^{\log_{10/9} n} \le n^{14}.$$
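The exponent in this bound can be checked directly. The short derivation below assumes that each application of Procedure 1 shrinks $|S_{ABCD}|$ by a factor of at least 9/10 and branches four ways, so the recursion depth is at most $\log_{10/9} n$.

```latex
\[
|\mathcal{L}| \;\le\; 4^{\log_{10/9} n} \;=\; n^{\log_{10/9} 4},
\qquad
\log_{10/9} 4 \;=\; \frac{\ln 4}{\ln(10/9)} \;\approx\; \frac{1.3863}{0.1054} \;\approx\; 13.16 \;<\; 14 .
\]
```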

Let $n = |S_{ABCD}(I)|$. For any skew partition $\{A, B, C, D\}$, let $A' = A \cap S_{ABCD}(I)$,
$B' = B \cap S_{ABCD}(I)$, $C' = C \cap S_{ABCD}(I)$, and $D' = D \cap S_{ABCD}(I)$.

Case 1: There exists a vertex $v$ in $S_{ABCD}$ such that $n/10 \le |S_{ABCD} \cap N(v)| \le 9n/10$.
Branch according to whether $v \in A$, $v \in B$, $v \in C$, or $v \in D$ with instances
$I_A, I_B, I_C, I_D$ respectively. We define $I_A$ by initially setting $S_A(I_A) = \{v\} \cup S_A(I)$
and reducing so that Property 2 holds. We define $I_B, I_C, I_D$ similarly. We note
that by Property 2, if $v \in C$, then $D \cap N(v) = \emptyset$. So, $S_{ABCD}(I_C) \subseteq S_{ABCD}(I) \setminus N(v)$.

Because there are at least $n/10$ vertices in $S_{ABCD} \cap N(v)$, this means that
$|S_{ABCD}(I_C)| \le 9n/10$.

Symmetrically, $|S_{ABCD}(I_D)| \le 9n/10$.

Similarly, by Property 2, $S_{ABCD}(I_A) \subseteq S_{ABCD}(I) \cap N(v)$, so $|S_{ABCD}(I_A)| \le 9n/10$.
Symmetrically, $|S_{ABCD}(I_B)| \le 9n/10$. □

Case 2: There are at least $n/10$ vertices $v$ in $S_{ABCD}$ such that $|S_{ABCD} \cap N(v)| < n/10$,
and there are at least $n/10$ vertices $v$ in $S_{ABCD}$ such that $|S_{ABCD} \cap N(v)| > 9n/10$.

Let $W = \{v \in S_{ABCD} : |S_{ABCD} \cap N(v)| > 9n/10\}$ and $X = \{v \in S_{ABCD} :
|S_{ABCD} \cap N(v)| < n/10\}$. Branch according to $|A'| > n/10$, or $|B'| > n/10$, or
$|C'| > n/10$, or $|D'| > n/10$ with corresponding instances $I_{A'}, I_{B'}, I_{C'}$ and $I_{D'}$. Each
of these choices forces either all the vertices in $W$ or all the vertices in $X$ to
have smaller label sets, as follows. If $|A'| > n/10$ then every vertex in $B$ has more than $n/10$
neighbours in $S_{ABCD}(I)$, so $B \cap X = \emptyset$. Thus, $S_{ABCD}(I_{A'}) = S_{ABCD}(I) \setminus X$,
and $|S_{ABCD}(I_{A'})| \le 9n/10$. If $|B'| > n/10$, then a symmetrical argument shows that
$A \cap X = \emptyset$. Thus, $S_{ABCD}(I_{B'}) = S_{ABCD}(I) \setminus X$, and $|S_{ABCD}(I_{B'})| \le 9n/10$. If
$|C'| > n/10$ then every vertex in $D$ has at least $n/10$ non-neighbours in $S_{ABCD}(I)$.
Hence $W \cap D = \emptyset$, $S_{ABCD}(I_{C'}) = S_{ABCD}(I) \setminus W$, and so $|S_{ABCD}(I_{C'})| \le 9n/10$.
If $|D'| > n/10$ then a symmetrical argument shows that $|S_{ABCD}(I_{D'})| \le 9n/10$. □

Case 3: There are over $9n/10$ vertices in $W$.

We will repeatedly apply the following procedure to various $W' \subseteq W$ with
$|W'| \ge 8n/10$. We recursively define a partition of $W'$ into three sets $O$, $T$, and $NT$
such that:

— There are all edges between O and T; 

— For every w in NT, there exists u in O such that w misses v, 

— The complement of O is connected. 

Start by choosing $v_1$ in $W'$ and setting: $O = \{v_1\}$, $T = N(v_1) \cap W'$, and
$NT = W' \setminus (N(v_1) \cup \{v_1\})$. Note that for each vertex $v$ of $W'$, since $v$ misses at most $n/10$
vertices of $S_{ABCD}$, $|N(v) \cap W'| \ge |W'| - n/10$. So $|NT| = |W' \setminus (N(v_1) \cup \{v_1\})| < n/10$.
Grow $O$ by moving an arbitrary vertex $v$ from $NT$ to $O$, and by moving $T \setminus N(v)$
from $T$ to $NT$ until:

(i) $|O| + |NT| \ge n/10$, or

(ii) $NT = \emptyset$.

If the growing process stops with condition (i), i.e., $|O| + |NT| \ge n/10$, and
$v_i$ was the last vertex added to $O$, then adding $v_i$ to $O$ increased $|NT|$ by
at most $|W' \setminus (N(v_i) \cup \{v_i\})| < n/10$. Thus, $|O| + |NT| < n/10 + n/10 = n/5$. So,

$$|T| \ge \frac{8n}{10} - \frac{n}{5} = \frac{6n}{10} \ge \frac{n}{10}.$$

Our first application of the procedure is with $W' = W$. If we stop because
(i) holds, then we define four new instances of LSP according to the intersection
of the skew partition sets $A$, $B$, $C$ or $D$ with $O$, as follows:

(a) $I_1$: $C \cap O \ne \emptyset$,

(b) $I_2$: $C \cap O = \emptyset$, $D \cap O \ne \emptyset$,

(c) $I_3$: $O \subseteq A$,

(d) $I_4$: $O \subseteq B$.

Recall that the complement of $O$ is connected, which implies that if
$O \cap (C \cup D) = \emptyset$, then either $O \subseteq A$ or $O \subseteq B$. If $O \subseteq A$, then $NT \cap B = \emptyset$, since
$(\forall w \in NT)(\exists v \in O)$ such that $vw \notin E$. Thus, $(O \cup NT) \cap S_{ABCD}(I_3) = \emptyset$. Hence
$|S_{ABCD}(I_3)| \le 9n/10$. A symmetrical argument shows that $|S_{ABCD}(I_4)| \le 9n/10$.
If $C \cap O \ne \emptyset$, then $D \cap T = \emptyset$. Thus, $T \cap S_{ABCD}(I_1) = \emptyset$, which implies
$|S_{ABCD}(I_1)| \le 9n/10$. A symmetrical argument shows that $|S_{ABCD}(I_2)| \le 9n/10$.
Thus, if our application of the above procedure halts with output (i), then we
have found the four desired output instances of LSP.

Otherwise, the growing process stops with condition (ii), i.e., $NT = \emptyset$ and
$|O| < n/10$. Set $O_1 = O$ and reapply the algorithm to $W_1 = W \setminus O_1$ to obtain
$O_2$. More generally, having constructed disjoint sets $O_1, \ldots, O_i$ with
$|\bigcup_{j \le i} O_j| < n/10$, we construct $O_{i+1}$ by applying the algorithm to
$W_i = W \setminus \bigcup_{j \le i} O_j$. Note that $|W_i| \ge 8n/10$.

We continue until $|\bigcup_{j \le i} O_j| \ge n/10$ or condition (i) occurs. If condition (i) ever
occurs, then we proceed as above. Otherwise, we stop after some iteration $i^*$
such that $|\bigcup_{i < i^*} O_i| < n/10$ and $|\bigcup_{i \le i^*} O_i| \ge n/10$. Since $|O_{i^*}| < n/10$ we have that




Finding Skew Partitions Efficiently 



171 



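$|\bigcup_{i \le i^*} O_i| < n/5$. Also, all the edges between sets $Z = \bigcup_{j \le i^*} O_j$ and
$Y = W \setminus \bigcup_{j \le i^*} O_j$ exist, which implies that $C \cap Z = \emptyset$ or $D \cap Y = \emptyset$.

We now define two new instances of LSP according to the intersection of
skew partition sets $C$ or $D$ with $Z$, as follows:

(a) $I_1$: $C \cap Z = \emptyset$,

(b) $I_2$: $D \cap Y = \emptyset$.

In either output instance $I_i$, $|S_{ABCD}(I_i)| \le 9n/10$. □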

Note that the case $|X| > 9n/10$ is symmetric to Case 3 (consider $\overline{G}$) and is
omitted.

5 Concluding Remarks 

It is evident to the authors that the techniques we have developed will apply to 
large classes of list-M-partition problems. We intend to address this in future 
work.

Abstract. Given a set of, say, m stocks (one of which may be “cash”), the online
portfolio selection problem is to determine a portfolio for the ith trading period
based on the sequence of prices for the preceding $i - 1$ trading periods.
Competitive analysis is based on a worst case perspective and such a perspective is
inconsistent with the more widely accepted analyses and theories based on dis- 
tributional assumptions. The competitive framework does (perhaps surprisingly) 
permit non trivial upper bounds on relative performance against CBAL-OPT, an op- 
timal offline constant rebalancing portfolio. Perhaps more impressive are some 
preliminary experimental results showing that certain algorithms that enjoy “re- 
spectable” competitive (i.e. worst case) performance also seem to perform quite 
well on historical sequences of data. These algorithms and the emerging competi- 
tive theory are directly related to studies in information theory and computational 
learning theory and indeed some of these algorithms have been pioneered within 
the information theory and computational learning communities. We present a 
mixture of both theoretical and experimental results, including a more detailed
study of the performance of existing and new algorithms with respect to a stan- 
dard sequence of historical data cited in many studies. We also present experiments 
from two other historical data sequences. 



1 Introduction 

This paper is concerned with the portfolio selection (PS) problem, defined as follows. 
Assume a market with m securities. The securities can be stocks, bonds, currencies, 
commodities, etc. For each trading day $i \ge 0$, let $v_i = (v_{i,1}, v_{i,2}, \ldots, v_{i,m})$ be the
price vector for the ith period, where $v_{i,j}$, the price or value of the jth security, is
given in the “local” currency, called here cash or dollars. For analysis it is often more
convenient to work with relative prices rather than prices. Define $x_{i,j} = v_{i,j}/v_{i-1,j}$
to be the relative price of the jth security corresponding to the ith period.^1 Denote by
$x_i = (x_{i,1}, \ldots, x_{i,m})$ the market vector of relative prices corresponding to the ith day.

^1 Here we are greatly simplifying the nature of the market and assuming that $x_{i+1,j}$ is the
ratio of the opening price on the $(i+1)$st day to the opening price on the ith day. That is, we are
assuming that a trader can buy or sell at the opening price. Later we try to compensate for this
by incorporating bid-ask spreads into transaction costs.

A portfolio b is specified by the proportions of current dollar wealth invested in each of
the securities:

$$b = (b_1, \ldots, b_m), \quad b_j \ge 0, \quad \sum_j b_j = 1.$$

The return of a portfolio b w.r.t. a market vector x is $b \cdot x = \sum_j b_j x_j$. The (compound)
return of a sequence of portfolios $B = b_1, \ldots, b_n$ w.r.t. a market sequence
$X = x_1, \ldots, x_n$ is

$$R(B, X) = \prod_{i=1}^{n} b_i \cdot x_i.$$

A PS algorithm is any deterministic or randomized rule for specifying a sequence of
portfolios. If ALG is a deterministic (respectively, randomized) PS algorithm then its
(expected) return with respect to a market sequence X is denoted by ALG(X).
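As a small illustration of these definitions, the compound return can be computed as below. The function names and the sample numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from typing import Sequence


def portfolio_return(b: Sequence[float], x: Sequence[float]) -> float:
    """Single-period return b . x of portfolio b on market vector x."""
    return sum(bj * xj for bj, xj in zip(b, x))


def compound_return(portfolios: Sequence[Sequence[float]],
                    market: Sequence[Sequence[float]]) -> float:
    """Compound return R(B, X): the product of the per-period returns."""
    if len(portfolios) != len(market):
        raise ValueError("need exactly one portfolio per trading period")
    total = 1.0
    for b, x in zip(portfolios, market):
        total *= portfolio_return(b, x)
    return total


# Two periods, two securities (illustrative numbers only).
B = [(0.5, 0.5), (1.0, 0.0)]
X = [(1.0, 2.0), (0.5, 3.0)]
print(compound_return(B, X))  # (0.5*1 + 0.5*2) * (1*0.5 + 0*3) = 1.5 * 0.5 = 0.75
```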

The basic PS problem described here ignores several important factors such as trans- 
action commissions, buy-sell spreads and risk tolerance and control. 

A simple strategy, advocated by many financial advisors, is to simply divide up the 
amount of cash available and to buy and hold a portfolio of the securities. This has the 
advantage of minimizing transaction costs and takes advantage of the natural tendency 
for the market to grow. In addition, there is a classical algorithm, due to Markowitz 
[Mar59], for choosing the weightings of the portfolio so as to minimize the variance for 
any target expected return. 

An alternative approach to portfolio management is to attempt to take advantage of 
volatility (exhibited in price fluctuations) and to actively trade on a “day by day” basis. 
Such trading can sometimes lead to returns that dramatically outperform the performance 
of the best security. 

For example, consider the class of constant rebalanced algorithms. An algorithm in
this class, denoted $\mathrm{CBAL}_b$, is specified by a fixed portfolio b and maintains a constant
weighting (by value) amongst the securities. Thus, at the beginning of each trading
period $\mathrm{CBAL}_b$ rebalances its portfolio so that it is b-balanced. The constant rebalanced
algorithms are motivated by several factors. In particular, it can be shown that the optimal
offline algorithm in this class, CBAL-OPT, can lead to exponential returns that dramatically
outperform the best stock (see e.g. [Cov91]).

One objective in studying the portfolio selection problem is to arrive at online trading 
strategies that are guaranteed, in some sense, to perform well. What is the choice of 
performance measure? We focus on a competitive analysis framework whereby the 
performance of an online portfolio selection strategy is compared to that of a benchmark 
algorithm on every input. One reasonable benchmark is the return provided by the best 
stock. For more active strategies, an optimal algorithm (that has complete knowledge 
of the future) could provide returns so extreme that any reasonable approach is doomed 
when viewed in comparison. Specifically, any online algorithm competing against the 
optimal offline algorithm, called OPT, is at best $m^n$-competitive, where n is the number
of trading days and m is the number of stocks. 

In the more classical (and perhaps most influential) approach, the PS problem is
broken down into two stages. First, one uses statistical assumptions and historical data
to create a model of stock prices. After this the model is used to predict future price
movements. The technical difficulties encountered in the more traditional approach (i.e.
formulating a realistic yet tractable statistical model) motivate competitive analysis.
The competitive analysis approach starts with minimal assumptions, derives algorithms 
within this worst case perspective and then perhaps adds statistical or distributional 
assumptions as necessary (to obtain analytical results and/or to suggest heuristic im- 
provements to the initial algorithms). It may seem unlikely that this approach would be 
fruitful but some interesting results have been proven. In particular, Cover’s Universal 
Portfolio algorithm [Cov91] was proven to possess important theoretical properties. We 
are mainly concerned with competitive analyses against CBAL-OPT and say that a portfolio
selection algorithm ALG is c-competitive (w.r.t. CBAL-OPT) if the supremum, over all
market sequences X, of the ratio CBAL-OPT(X)/ALG(X) is less than or equal to c.

Instead of looking at ALG’s competitive ratio we can equivalently measure the degree
of “universality” of ALG. Following Cover [Cov91], we say that ALG is universal if for
all X,

$$\frac{\log \mathrm{CBAL\text{-}OPT}(X)}{n} - \frac{\log \mathrm{ALG}(X)}{n} \to 0 \quad \text{as } n \to \infty.$$

In this sense, the “regret” experienced by an investor that uses a universal online algorithm
approaches zero as time goes on. Clearly the rate at which this regret approaches zero
corresponds to the competitive ratio and ALG is universal if and only if its competitive ratio
is $2^{o(n)}$. One motivation for measuring performance by universality is that it corresponds
to the minimization of the regret, using a logarithmic utility function (see [BE98]). On
the other hand, it obscures the convergence rate and therefore we prefer to use the
competitive ratio. When the competitive ratio of a PS algorithm (against CBAL-OPT) can
be bounded by a polynomial in n (for a fixed number of stocks), we shall say that the
algorithm is competitive.



2 Some Classes of PS Algorithms 

2.1 Buy-And-Hold (BAH) Algorithms 

The simplest portfolio selection policy is buy-and-hold (bah): Invest in a particular 
portfolio and let it sit for the entire duration of the investment. Then, in the end, cash 
the portfolio out of the market. The optimal offline algorithm, bah-opt, invests in the 
best performing stock for the relevant period. Most investors would probably consider 
themselves to be very successful if they were able to achieve the return of bah-opt. 

2.2 Constant-Rebalanced (CBAL) Algorithms 

The constant-rebalanced (CBAL) algorithm $\mathrm{CBAL}_b$ has an associated fixed portfolio
$b = (b_1, \ldots, b_m)$ and operates as follows: at the beginning of each trading period it makes
trades so as to rebalance its portfolio to b (that is, a fraction $b_i$ is invested in the ith
stock, $i = 1, \ldots, m$). It is easy to see that the return of CBAL-OPT is bounded from below
by the return of BAH-OPT since every BAH strategy can be thought of as an extremal CBAL
algorithm. It has been empirically shown that in real market sequences, the return of
CBAL-OPT can dramatically outperform the best stock (see e.g. Table 3).
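The difference between rebalancing and holding can be made concrete with a short sketch. The helper names are hypothetical, and the input is an illustrative two-asset market (cash plus a stock that alternately halves and doubles), assumed here purely for demonstration.

```python
from typing import Sequence


def cbal_return(b: Sequence[float], market: Sequence[Sequence[float]]) -> float:
    """Return of CBAL_b: wealth is rebalanced to b before every period."""
    wealth = 1.0
    for x in market:
        wealth *= sum(bj * xj for bj, xj in zip(b, x))
    return wealth


def bah_return(b: Sequence[float], market: Sequence[Sequence[float]]) -> float:
    """Return of buy-and-hold: each initial holding compounds on its own."""
    holdings = list(b)
    for x in market:
        holdings = [h * xj for h, xj in zip(holdings, x)]
    return sum(holdings)


# Asset 1 is cash; asset 2 alternately halves and doubles (assumed input).
market = [(1.0, 0.5), (1.0, 2.0)] * 5          # n = 10 periods

print(bah_return((0.0, 1.0), market))          # the stock alone ends where it started
print(cbal_return((0.5, 0.5), market))         # (3/4 * 3/2)^5 = (9/8)^5
# A portfolio concentrated on one asset behaves identically under both rules,
# which is why every single-stock BAH strategy is an extremal CBAL.
print(cbal_return((0.0, 1.0), market) == bah_return((0.0, 1.0), market))
```

The rebalanced portfolio profits from the volatility even though neither asset gains value over the whole sequence.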
Example 1 (Cover and Gluss [CG86], Cover [Cov91]). Consider the case m = 2 with
one stock and cash. Consider the market sequence

$$X = \begin{pmatrix} 1 \\ 1/2 \end{pmatrix}, \begin{pmatrix} 1 \\ 2 \end{pmatrix}, \begin{pmatrix} 1 \\ 1/2 \end{pmatrix}, \begin{pmatrix} 1 \\ 2 \end{pmatrix}, \ldots$$

$$\mathrm{CBAL}_{(\frac12,\frac12)}(X) = \left[ \left( \tfrac12 + \tfrac14 \right) \left( \tfrac12 + 1 \right) \right]^{n/2} = \left( \tfrac34 \cdot \tfrac32 \right)^{n/2} = \left( \tfrac98 \right)^{n/2}.$$

Thus, for this market, the return of $\mathrm{CBAL}_{(\frac12,\frac12)}$ is exponential in n while the best stock is
moving nowhere.

Under the assumption that the daily market vectors are independently
and identically distributed (i.i.d.), there is yet another motivating reason for considering
CBAL algorithms.

Theorem 1 (Cover and Thomas [CT91]). Let $X = x_1, \ldots, x_n$ be i.i.d. according to
some distribution F(x). Then, for some b, $\mathrm{CBAL}_b$ performs at least as well (in the sense
of expected return) as the best online PS algorithm.

Theorem 1 tells us that it is sufficient to look for our best online algorithm in the set
of CBAL algorithms, provided that the market is generated by an i.i.d. source. One should
keep in mind, however, that the i.i.d. assumption is hard to justify. (See [Gre72,BCK92]
for alternative theories.)

2.3 Switching Sequences (Extremal Algorithms) 

Consider any sequence of stock indices,

$$j_1, j_2, \ldots, j_n, \quad j_i \in \{1, \ldots, m\}.$$

This sequence prescribes a portfolio management policy that switches its entire wealth
from stock $j_i$ to stock $j_{i+1}$. An algorithm for generating such sequences can be
deterministic or randomized. Ordentlich and Cover [OC96] introduce switching sequence
algorithms (called extremal strategies). As investment strategies, switching sequences
may seem to be very speculative but from a theoretical perspective they are well
motivated. In particular, for any market sequence, the true optimal algorithm (called OPT) is
a switching sequence.^2

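Example 2. Consider the following market sequence X:

$$\begin{pmatrix} 0 \\ 2 \end{pmatrix}, \begin{pmatrix} 0 \\ 2 \end{pmatrix}, \begin{pmatrix} 0 \\ 2 \end{pmatrix}, \begin{pmatrix} 0 \\ 2 \end{pmatrix}, \begin{pmatrix} 2 \\ 0 \end{pmatrix}, \begin{pmatrix} 2 \\ 0 \end{pmatrix}, \begin{pmatrix} 2 \\ 0 \end{pmatrix}, \begin{pmatrix} 2 \\ 0 \end{pmatrix}.$$

Starting with $1, the switching sequence 2, 2, 2, 2, 1, 1, 1, 1 returns $256. In contrast,
any (mixture of) buy and holds will go bankrupt on X returning $0 and for all b,

$$\mathrm{CBAL}_b(X) \le \mathrm{CBAL}_{(\frac12,\frac12)}(X) = \mathrm{CBAL\text{-}OPT}(X) = \$1.$$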
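The three returns quoted in Example 2 can be checked with a few lines of code. The sketch below assumes the market consists of four days with relative prices (0, 2) followed by four days with (2, 0), which is consistent with the stated returns; the helper names are hypothetical.

```python
from typing import Sequence


def switching_return(indices: Sequence[int],
                     market: Sequence[Sequence[float]]) -> float:
    """Return of a switching sequence; stocks are numbered from 1 as in the text."""
    wealth = 1.0
    for j, x in zip(indices, market):
        wealth *= x[j - 1]          # all wealth sits in stock j during this period
    return wealth


def cbal_return(b: Sequence[float], market: Sequence[Sequence[float]]) -> float:
    """Return of the constant-rebalanced portfolio b."""
    wealth = 1.0
    for x in market:
        wealth *= sum(bj * xj for bj, xj in zip(b, x))
    return wealth


def bah_return(b: Sequence[float], market: Sequence[Sequence[float]]) -> float:
    """Return of buying b once and holding it."""
    holdings = list(b)
    for x in market:
        holdings = [h * xj for h, xj in zip(holdings, x)]
    return sum(holdings)


# Assumed market: stock 2 doubles while stock 1 crashes, then the roles swap.
market = [(0.0, 2.0)] * 4 + [(2.0, 0.0)] * 4

print(switching_return([2, 2, 2, 2, 1, 1, 1, 1], market))  # 2**8 = 256.0
print(bah_return((0.3, 0.7), market))                      # 0.0: both holdings are wiped out
print(cbal_return((0.5, 0.5), market))                     # 1.0: each day returns exactly 1
```

For a general portfolio $(b_1, b_2)$ the rebalanced return on this market is $(2b_2)^4 (2b_1)^4 = 256\,(b_1 b_2)^4$, which is maximized at $b_1 = b_2 = 1/2$.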

^2 Here we are assuming either no transaction costs or a simple transaction cost model such as a
fixed percentage commission.
3 Some Basic Properties

In this section we derive some basic properties concerning PS algorithms. 

3.1 Kelly Sequences 

A Kelly (or a “horse race”) market vector for m stocks is a vector of the form

$$\underbrace{(0, \ldots, 0, 1, 0, \ldots, 0)}_{m}.$$

That is, except for one stock that retains its value, all stocks crash.

Let $\mathcal{K}_m^n$ be the set of all length n Kelly market sequences over m stocks. (When
m or n is clear from the context it will be omitted.) There are $m^n$ Kelly
market sequences of length n. Kelly sequences were introduced by Kelly [Kel56] to
model horse race gambling. We can use them to derive lower bounds for online PS
algorithms (see e.g. [CO96] and Lemma 6 below).

The following simple but very useful lemma is due to Ordentlich and Cover [OC96] . 

Lemma 1. Let ALG be any online algorithm. Then

$$\sum_{K \in \mathcal{K}_m^n} \mathrm{ALG}(K) = 1.$$

Proof. We sketch the proof for the case m = 2. The proof is by induction on n. For the
base case, n = 1, notice that ALG must specify its first portfolio, $(b, 1 - b)$, before the
(Kelly) market sequence (in this case, of length one) is presented. The two possible Kelly
sequences are $K_1 = (0, 1)$ and $K_2 = (1, 0)$. Therefore, $\mathrm{ALG}(K_1) + \mathrm{ALG}(K_2) = (1 - b) + b = 1$.
The induction hypothesis states that the lemma holds for $n - 1$ days. The proof is
complete when we add up the two returns corresponding to the two possible Kelly vectors
for the first day.

□ 
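Lemma 1 is easy to confirm by brute force for small m and n. The sketch below enumerates all $m^n$ Kelly sequences and sums the returns of an arbitrary online rule; the rule used here (a smoothed frequency of past winners) and all function names are hypothetical choices for illustration only.

```python
from itertools import product
from typing import Callable, List, Sequence, Tuple

Vector = Tuple[float, ...]
OnlineRule = Callable[[Sequence[Vector]], List[float]]


def kelly_vector(winner: int, m: int) -> Vector:
    """Market vector in which only stock `winner` (0-based) keeps its value."""
    return tuple(1.0 if j == winner else 0.0 for j in range(m))


def run_online(rule: OnlineRule, market: Sequence[Vector]) -> float:
    """Return of an online rule that sees only the past market vectors."""
    wealth = 1.0
    for t, x in enumerate(market):
        b = rule(market[:t])
        wealth *= sum(bj * xj for bj, xj in zip(b, x))
    return wealth


def total_over_kelly(rule: OnlineRule, m: int, n: int) -> float:
    """Sum of the rule's returns over all m**n Kelly sequences of length n."""
    total = 0.0
    for winners in product(range(m), repeat=n):
        total += run_online(rule, [kelly_vector(w, m) for w in winners])
    return total


def smoothed_frequency_rule(m: int) -> OnlineRule:
    """Hypothetical online rule: weight each stock by 1 + its number of past wins."""
    def rule(history: Sequence[Vector]) -> List[float]:
        counts = [1.0] * m
        for x in history:
            counts[x.index(1.0)] += 1.0
        s = sum(counts)
        return [c / s for c in counts]
    return rule


print(total_over_kelly(smoothed_frequency_rule(3), m=3, n=5))  # equals 1 up to rounding
```

Any other rule that outputs a valid portfolio from the history gives the same total, since the portfolio weights chosen on each day sum to one.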

Lemma 1 permits us to relate cbals to online switching sequences as shown in the next 
two lemmas. 

Lemma 2. Let ALG be any PS algorithm. There exists an algorithm ALG', which is a
mixture of switching sequences, that is equivalent (in terms of expected return) to ALG,
over Kelly market sequences.

Proof. Fix n. By Lemma 1 we have $\sum_{K_\ell \in \mathcal{K}^n} \mathrm{ALG}(K_\ell) = 1$ with $\mathrm{ALG}(K_\ell) \ge 0$ for
all $K_\ell \in \mathcal{K}^n$. Therefore, $\{\mathrm{ALG}(K_\ell)\}_\ell$ is a probability distribution. For a sequence of
Kelly market vectors $K = k_1, k_2, \ldots, k_n$, denote by $S_K = s(k_1), s(k_2), \ldots, s(k_n)$ the
switching sequence where $s(k_i)$ is the index of the stock with relative price one in $k_i$.
Let $\mathrm{ALG}' = \sum_{K_\ell \in \mathcal{K}^n} \mathrm{ALG}(K_\ell) \cdot S_{K_\ell}$ be the mixture of switching sequences that assigns
a weight $\mathrm{ALG}(K_\ell)$ to the sequence $S_{K_\ell}$. Clearly, for each Kelly market sequence K we
have $\mathrm{ALG}'(K) = \mathrm{ALG}(K)$.

Lemma 2 can be extended in the following way.

Lemma 3. If ALG = $\mathrm{CBAL}_b$ is a constant-rebalanced algorithm then (for known n) we
can achieve the same return as $\mathrm{CBAL}_b$ on any market sequence using the same method.
That is, use the mixture $\mathrm{ALG}' = \sum_{\ell} \mathrm{CBAL}_b(K_\ell) \cdot S_{K_\ell}$.

Proof. To prove this, consider any $b = (b_1, \ldots, b_m)$. The return of $\mathrm{CBAL}_b$ over a Kelly
market sequence $K_\ell = k_1^\ell, \ldots, k_n^\ell$ is $\mathrm{CBAL}_b(K_\ell) = \prod_i b_{s(k_i^\ell)}$, which gives the weight of
$S_{K_\ell}$ in ALG'. Therefore, for an arbitrary market sequence X we have

$$\mathrm{ALG}'(X) = \sum_{\ell} \prod_{i=1}^{n} b_{s(k_i^\ell)} \, x_{i, s(k_i^\ell)}
= \prod_{i=1}^{n} \sum_{j=1}^{m} b_j x_{i,j}
= \prod_{i=1}^{n} b \cdot x_i = \mathrm{CBAL}_b(X).$$
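The identity in Lemma 3 can be verified numerically for small m and n. The following sketch builds the mixture of all $m^n$ switching sequences, weights each by the product of the corresponding entries of b, and compares the result with the return of $\mathrm{CBAL}_b$. The function names and the sample market are hypothetical.

```python
from itertools import product
from math import prod
from typing import Sequence


def cbal_return(b: Sequence[float], market: Sequence[Sequence[float]]) -> float:
    """Return of the constant-rebalanced portfolio b."""
    return prod(sum(bj * xj for bj, xj in zip(b, x)) for x in market)


def switching_mixture_return(b: Sequence[float],
                             market: Sequence[Sequence[float]]) -> float:
    """Return of the mixture that gives switching sequence s the weight prod_i b[s_i]."""
    m, n = len(b), len(market)
    total = 0.0
    for seq in product(range(m), repeat=n):
        weight = prod(b[j] for j in seq)                   # CBAL_b on the matching Kelly sequence
        payoff = prod(x[j] for j, x in zip(seq, market))   # return of the switching sequence on X
        total += weight * payoff
    return total


b = (0.2, 0.3, 0.5)
X = [(1.1, 0.9, 1.0), (0.7, 1.3, 1.2), (1.0, 1.0, 0.5), (2.0, 0.1, 0.9)]
print(cbal_return(b, X), switching_mixture_return(b, X))   # the two values agree up to rounding
```

The enumeration is exponential in n, so this is a check of the algebra rather than a practical way to run CBAL.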



Theorem 2. (i) A (mixture of) CBAL algorithms can emulate a (mixture of) BAH
algorithms. (ii) A (mixture of) SS algorithms can emulate a CBAL algorithm and hence any
mixture of CBALs.



Lemma 4. The competitive ratio is invariant under scaling of the relative prices. In
particular, it is invariant under scaling of each day independently. Thus we can assume
without loss of generality that all relative prices are in [0, 1].



Lemma 5. In the game against opt it is sufficient to prove lower bounds using only 
Kelly market sequences. 

Proof. Let $X = x_1, \ldots, x_n$ be an arbitrary market sequence. Using Lemma 4 we can
scale each day independently so that in each market vector $x_i = (x_{i,1}, \ldots, x_{i,m})$, the
maximum relative price of each day (say it is $x_{i,j_{\max}}$) equals 1. Now we consider the
“Kelly projection” $X'$ of the market $X$; that is, in the market vector $x'_i$, $x'_{i,j_{\max}} = 1$ and
$x'_{i,j} = 0$ for $j \ne j_{\max}$. For any algorithm ALG, we have $\mathrm{ALG}(X) \ge \mathrm{ALG}(X')$, but OPT
always satisfies (when there are no commissions) $\mathrm{OPT}(X) = \mathrm{OPT}(X')$. □

3.2 Lower Bound Proof Technique 

Using Lemma 1 we can now describe a general lower bound proof technique due to
Cover and Ordentlich.

Lemma 6 (Portfolio Selection Lower Bound). Let $\mathrm{OPT}_C$ be an optimal offline algorithm
from a class C of offline algorithms (e.g. CBAL-OPT when C is the class of constant
rebalanced portfolios). Then $r_n = \sum_{K \in \mathcal{K}^n} \mathrm{OPT}_C(K)$ is a lower bound on the competitive
ratio of any online algorithm relative to $\mathrm{OPT}_C$.

Proof. We use Lemma 1. Clearly, the maximum element in any set is not smaller than
any weighted average of all the elements in the set. Let $Q_K = \mathrm{ALG}(K)$, so that by
Lemma 1 the weights $Q_K$ sum to 1. We then have

$$\max_{K \in \mathcal{K}^n} \frac{\mathrm{OPT}_C(K)}{\mathrm{ALG}(K)}
\ge \sum_{K \in \mathcal{K}^n} Q_K \cdot \frac{\mathrm{OPT}_C(K)}{\mathrm{ALG}(K)}
= \sum_{K \in \mathcal{K}^n} \mathrm{ALG}(K) \cdot \frac{\mathrm{OPT}_C(K)}{\mathrm{ALG}(K)}
= \sum_{K \in \mathcal{K}^n} \mathrm{OPT}_C(K).$$

□



4 Optimal Bounds against BAH-OPT and OPT

In some cases, Lemma 6 provides the means for easily proving lower bounds. For
example, consider the PS game against BAH-OPT. For any market sequence, BAH-OPT invests
its entire wealth in the best stock. On a Kelly sequence this return is 1 if the same stock
wins every day and 0 otherwise. Therefore, $\sum_{K \in \mathcal{K}^n} \mathrm{BAH\text{-}OPT}(K) = m$, and m is a lower
bound on the competitive ratio of any investment algorithm for this problem. Moreover,
m is a tight bound since the algorithm that invests 1/m of its initial wealth in each of
the m stocks achieves a competitive ratio of m.

Similarly, for any Kelly market sequence K we have $\mathrm{OPT}(K) = 1$. Therefore, as
there are $m^n$ Kelly sequences of length n, we have $\sum_{K \in \mathcal{K}^n} \mathrm{OPT}(K) = m^n$ and thus
$m^n$ is a lower bound on the competitive ratio of any online algorithm against OPT. In this
case, $\mathrm{CBAL}_{(1/m, \ldots, 1/m)}$ achieves the optimal bound!
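Both sums can be reproduced by enumerating Kelly sequences, as in the sketch below. A Kelly sequence is represented by its tuple of daily winners; this encoding and the function names are assumptions made for the illustration, and n is taken to be at least 1.

```python
from itertools import product
from typing import Callable, Tuple

Winners = Tuple[int, ...]   # winners[t] is the (0-based) stock that keeps its value on day t


def bah_opt_on_kelly(winners: Winners) -> float:
    """BAH-OPT on a Kelly sequence: a held stock survives only if it wins every day."""
    return 1.0 if len(set(winners)) == 1 else 0.0


def opt_on_kelly(winners: Winners) -> float:
    """OPT on a Kelly sequence: it always holds the day's winner, so its wealth stays 1."""
    return 1.0


def kelly_lower_bound(offline_return: Callable[[Winners], float], m: int, n: int) -> float:
    """Lemma 6: the sum of the offline benchmark's returns over all m**n Kelly sequences."""
    return sum(offline_return(w) for w in product(range(m), repeat=n))


print(kelly_lower_bound(bah_opt_on_kelly, m=3, n=4))   # 3.0, i.e. m
print(kelly_lower_bound(opt_on_kelly, m=3, n=4))       # 81.0, i.e. m**n
```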



5 How Well Can We Perform against CBAL-OPT 



Comparison against cbal-opt has received considerable attention in the literature. The 
best known results are summarized in Table 1. Using the lower bound technique from 
Section 3.2, we first establish the Ordentlich and Cover [OC96] lower bound. 

The following lemma is immediate but very useful. 

Lemma 7. Consider a Kelly market sequence $X^n = x_1, \ldots, x_n$ over m stocks. We can
represent $X^n$ as a sequence $X^n = x_1, \ldots, x_n \in \{1, 2, \ldots, m\}^n$. Let $\mathrm{CBAL}_b$ be a
constant-rebalanced algorithm. Suppose that in the sequence $X^n$ there are $n_j$ occurrences of
(stock) j with $\sum_j n_j = n$. We say that such a Kelly sequence has type $(n_1, n_2, \ldots, n_m)$.

For a sequence $X^n$ of type $(n_1, n_2, \ldots, n_m)$, the return $R(b, X^n)$ of $\mathrm{CBAL}_b$ on the
sequence $X^n$ is

$$R(b, X^n) = \prod_{j=1}^{m} b_j^{n_j}.$$

That is, the return of any CBAL depends only on the type.




180 



A. Borodin, R. El-Yaniv, V. Gogan 



The development in Section 6.2 gives an alternative and more informative proof in
that it determines the optimal portfolio $b^*$ for any Kelly sequence $X^n$ of a given type;
namely $b^* = (n_1/n, \ldots, n_m/n)$.

We then can apply Lemma 6 to the class C of CBAL algorithms to obtain:

Lemma 8. Let ALG be any online PS algorithm. Then a lower bound for the competitive
ratio of ALG against CBAL-OPT is

$$r_n = \sum_{(n_1, \ldots, n_m) :\, \sum_j n_j = n} \binom{n}{n_1, \ldots, n_m} \prod_{j=1}^{m} \left( \frac{n_j}{n} \right)^{n_j}.$$

Proof. Consider the case m = 2. There are clearly $\binom{n}{n_1}$ length n Kelly sequences
X having type $(n_1, n_2) = (n_1, n - n_1)$ and for each such sequence
$\mathrm{CBAL\text{-}OPT}(X) = (n_1/n)^{n_1} (n_2/n)^{n_2}$.
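For m = 2 the lower bound is a single sum over types and is cheap to evaluate. The sketch below assumes the optimal portfolio for a type $(n_1, n_2)$ is $(n_1/n, n_2/n)$ and compares the sum with the value $\sqrt{\pi n / 2}$ that it approaches for large n; the function names are hypothetical.

```python
from math import comb, pi, sqrt


def cbal_opt_on_type(n1: int, n2: int) -> float:
    """Return of the best CBAL on a two-stock Kelly sequence of type (n1, n2)."""
    n = n1 + n2
    # Python evaluates 0.0 ** 0 as 1.0, which is the convention needed here.
    return (n1 / n) ** n1 * (n2 / n) ** n2


def lower_bound_two_stocks(n: int) -> float:
    """Sum of CBAL-OPT over all 2**n Kelly sequences, grouped by type."""
    return sum(comb(n, k) * cbal_opt_on_type(k, n - k) for k in range(n + 1))


print(lower_bound_two_stocks(1))   # 2.0
print(lower_bound_two_stocks(2))   # 1 + 2 * (1/4) + 1 = 2.5
for n in (50, 100, 400):
    # The ratio to sqrt(pi * n / 2) drifts towards 1 as n grows.
    print(n, lower_bound_two_stocks(n) / sqrt(pi * n / 2))
```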




Using a mixture of switching sequences and a min-max analysis, Ordentlich and
Cover provide a matching upper bound (for any market sequence of length n) showing
that $r_n$ is the optimal bound for the competitive ratio in the context of a known horizon n.



Table 1. The best known results w.r.t. CBAL-OPT

                   m = 2                          m ≥ 2                                                          Source
Lower^3            $r_n \simeq \sqrt{\pi n / 2}$   $r_n \simeq \frac{\sqrt{\pi}}{\Gamma(m/2)} (n/2)^{(m-1)/2}$   [OC96]
Upper (known n)    $r_n$                          $r_n$                                                          [OC96]
Upper              $2\sqrt{n+1}$                  $2(n+1)^{(m-1)/2}$                                             [CO96]




From the discussion above it follows immediately that any one CBAL (or any finite
mixture of CBALs) cannot be competitive relative to CBAL-OPT; indeed for any $m \ge 2$, the
competitive ratio of any CBAL against CBAL-OPT will grow exponentially in n.

5.1 Cover’s Universal Portfolio Selection Algorithm 

The Universal Portfolio algorithms presented by Cover [Cov91] are special cases of
the class of “$\mu$-weighted” algorithms which we denote by $w^\mu$. A rather intuitive
understanding of the $\mu$-weighted algorithms was given by Cover and Ordentlich. These
algorithms are parameterized by a distribution $\mu$ over the set of all portfolios B. Cover
and Ordentlich show the following result:

$$\text{wealth of } w^\mu = E_{\mu(b)} \left[ \text{wealth of } \mathrm{CBAL}_b \right].$$

This observation is interesting because the definition of $w^\mu$ (see [CO96]) is in terms of a
sequence of adaptive portfolios (depending on the market sequence) that progressively
give more weight to the better-performing constant rebalanced portfolios. But the above
observation shows that the return of these $\mu$-weighted algorithms is equivalent to a
“non-learning algorithm”. That is, a mixture of CBALs specifies a randomized trading strategy
that is in some sense independent of the stock market data. Of course, the composite
portfolio determined by a mixture of CBALs does depend on the market sequence.

^3 The Gamma function is defined as $\Gamma(x) = \int_0^\infty e^{-t} t^{x-1} \, dt$. It can be shown that $\Gamma(1) = 1$
and that $\Gamma(x + 1) = x \Gamma(x)$. Thus if $n \ge 1$ is an integer, $\Gamma(n + 1) = n!$. Note also that
$\Gamma(1/2) = \sqrt{\pi}$.

Cover and Ordentlich analyze two instances of $w^\mu$: one (called UNI) that uses the
uniform distribution (equivalently, the Dirichlet(1, ..., 1) distribution) and another
(here simply called DIR) that uses the Dirichlet(1/2, ..., 1/2) distribution. They prove
that the uniform algorithm UNI has competitive ratio

$$\binom{n + m - 1}{m - 1} \qquad (1)$$

and that this bound is tight. Somewhat surprisingly (in contrast, see the discussion
concerning the algorithms in Sections 6.3-6.5) this bound can be extracted from UNI
by an adversary using only Kelly market sequences; in fact, by using a Kelly sequence
X in which one fixed stock “wins” every day. For the case of m = 2, this can be easily
seen since the return of $\mathrm{CBAL}_b$ is $b^n$ for n days and $\int_0^1 b^n \, db = \frac{1}{n+1}$.

Cover and Ordentlich show that the DIR algorithm has competitive ratio

$$\frac{\Gamma(1/2) \, \Gamma(n + m/2)}{\Gamma(m/2) \, \Gamma(n + 1/2)} \le 2 (n + 1)^{(m-1)/2} \qquad (2)$$

and here again the bound is tight and achieved using Kelly sequences. Hence for any
fixed m, there is a constant gap between the optimal lower bound for fixed horizon and
the upper bound provided by DIR for unknown horizon.
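The two ratios are simple to tabulate. The sketch below assumes the UNI ratio is the binomial coefficient $\binom{n+m-1}{m-1}$ and the DIR ratio is the Gamma-function expression in (2); the function names are hypothetical, and `math.lgamma` is used so that large arguments do not overflow.

```python
from math import comb, exp, lgamma


def uni_ratio(n: int, m: int) -> int:
    """Competitive ratio of UNI against CBAL-OPT: C(n + m - 1, m - 1)."""
    return comb(n + m - 1, m - 1)


def dir_ratio(n: int, m: int) -> float:
    """Competitive ratio of DIR: Gamma(1/2) Gamma(n + m/2) / (Gamma(m/2) Gamma(n + 1/2))."""
    return exp(lgamma(0.5) + lgamma(n + m / 2) - lgamma(m / 2) - lgamma(n + 0.5))


def dir_upper_bound(n: int, m: int) -> float:
    """The closed-form bound 2 (n + 1)^((m - 1) / 2) from (2)."""
    return 2.0 * (n + 1) ** ((m - 1) / 2)


# With two stocks, UNI loses a factor n + 1, while DIR loses roughly sqrt(pi * n).
for n in (10, 100, 1000):
    print(n, uni_ratio(n, 2), dir_ratio(n, 2), dir_upper_bound(n, 2))
```

For m = 2 the UNI value n + 1 matches the integral above: the benchmark earns 1 on the all-wins Kelly sequence, while UNI earns 1/(n + 1).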

It is instructive to consider an elegant proof of the universality of UNI (with a slightly
inferior bound) due to Blum and Kalai [BK97].^4 Let B be the $(m - 1)$-dimensional
simplex of portfolio vectors and let $\mu$ be any distribution over B. Recall that the return
of the $\mu$-weighted algorithm is a $\mu$-weighted average of the returns of all $\mathrm{CBAL}_b$ algorithms. Let X be
any market sequence of length n and let $\mathrm{CBAL}_{b^*}$ = CBAL-OPT. Say that b is “near” $b^*$ if
$b = \frac{n}{n+1} b^* + \frac{1}{n+1} z$ for some $z \in B$. Therefore, for each day i we have

$$\mathrm{CBAL}_b(x_i) \ge \frac{n}{n+1} \, \mathrm{CBAL}_{b^*}(x_i).$$

So, for n days,

$$\frac{\mathrm{CBAL}_{b^*}(X)}{\mathrm{CBAL}_b(X)} \le \left( 1 + \frac{1}{n} \right)^n \le e.$$

Let $\mathrm{Vol}_m(\cdot)$ denote the m-dimensional volume.
Under the uniform distribution over B, the probability that b is near $b^*$ is

$$\Pr[b \text{ near } b^*]
= \frac{\mathrm{Vol}_{m-1}\left( \frac{n}{n+1} b^* + \frac{1}{n+1} B \right)}{\mathrm{Vol}_{m-1}(B)}
= \frac{\mathrm{Vol}_{m-1}\left( \frac{1}{n+1} B \right)}{\mathrm{Vol}_{m-1}(B)}
= \left( \frac{1}{n+1} \right)^{m-1}.$$

Thus, a $\left( \frac{1}{n+1} \right)^{m-1}$ fraction of the initial wealth is invested in CBALs which are “near”
CBAL-OPT, each of which is attaining a ratio e. Therefore, the competitive ratio achieved
is $e \cdot (n + 1)^{m-1}$.

^4 See also the web page http://www.cs.cmu.edu/~akalai/coltfinal/slides.



5.2 Expert Advice and the eg Algorithm 

The EG algorithm proposed by Helmbold et al. [HSSW98] takes a different approach. It
tries to move towards the CBAL-OPT portfolio by using an update function that maximizes
an objective function of the form

$$F(b_{t+1}) = \eta \log(b_{t+1} \cdot x_t) - d(b_{t+1}, b_t)$$

where $d(b, b')$ is some distance or dissimilarity measure over distributions (portfolios)
and $\eta$ is a learning rate parameter.
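The text does not spell out the resulting update rule. As a hedged illustration, the sketch below uses the multiplicative (exponentiated gradient) update commonly attributed to Helmbold et al., obtained when d is the relative entropy and the logarithm is replaced by its first-order approximation; treat this form, and the function name `eg_update`, as assumptions rather than the authors' exact implementation.

```python
from math import exp
from typing import List, Sequence


def eg_update(b: Sequence[float], x: Sequence[float], eta: float) -> List[float]:
    """One exponentiated-gradient step from portfolio b after seeing price relatives x.

    Assumed update: b_j <- b_j * exp(eta * x_j / (b . x)), followed by normalization.
    """
    r = sum(bj * xj for bj, xj in zip(b, x))      # portfolio return b . x (must be positive)
    weights = [bj * exp(eta * xj / r) for bj, xj in zip(b, x)]
    z = sum(weights)
    return [w / z for w in weights]


b0 = [0.5, 0.5]
print(eg_update(b0, [1.0, 2.0], eta=0.0))    # [0.5, 0.5]: with eta = 0 the portfolio never moves
print(eg_update(b0, [1.0, 2.0], eta=0.05))   # weight shifts slightly towards the second stock
print(eg_update(b0, [1.3, 1.3], eta=0.05))   # identical price relatives leave b unchanged
```

Starting from the uniform portfolio, a zero learning rate therefore reproduces the uniform CBAL.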

The competitive bound proven by Helmbold et al for eg is weaker than the bound ob- 
tained for UNI. However, eg is computationally much simpler than uni and experimentally 
it outperforms uni on the New York Stock Exchange data (see [HSSW98] and Table 3). 
The EG algorithm developed from a framework for online regression and a successful 
body of work devoted to predicting based on expert advice. When trying to select the best 
expert (or a weighting of experts), the eg algorithm is well motivated. It is trying to mini- 
mize a loss function based on the weighting of various expert opinions and in this regard 
it is similar to uni. However, it is apparent that cbal-opt does not make its money (over 
buy and hold) by seeking out the best stock. If one is maintaining a constant portfolio, 
one is selling rising stocks and buying falling ones. This strategy is advantageous when 
the falling stock reverses its trend and starts rising. We also note that in order to prove
the universality of EG, the value of the learning rate $\eta$ decreases to zero as the horizon
n increases. When $\eta = 0$, EG degenerates to the uniform $\mathrm{CBAL}_{(1/m, 1/m, \ldots, 1/m)}$ which is
not universal whereas the small learning rate (as given by their proof) of EG is sufficient
to make it universal. It is also the case that if each day the price relatives for all stocks
were identical, then (as one would expect) EG will again be identical to the uniform CBAL.
Hence when one combines a small learning rate with a “reasonably stable” market (i.e.
the price relatives are not too erratic), we might expect the performance of EG to be similar
to that of the uniform CBAL and this seems to be confirmed by our experiments.
6 On Some Portfolio Selection Algorithms

In this section we present and discuss several online portfolio selection algorithms. The 
DELTA algorithm is a new algorithm suggested by the goal of exploiting the rationale of 
constant rebalancing algorithms. The other algorithms are adaptations of known predic- 
tion algorithms. 

6.1 The DELTA Algorithm 

In this section we define what we call the DELTA(r, w, $t_{corr}$) algorithm. Informally, this
online algorithm operates as follows. There is a risk parameter r between 0 and 1
controlling the fraction of a stock's value that the algorithm is willing to trade away on any
given day. Each stock will risk that proportion of its weight if it is climbing in value
and is sufficiently anti-correlated with other stock(s). If it is anti-correlated, the at-risk
amount is spread amongst the falling stocks proportional to the correlation coefficient.^5
The algorithm takes two other parameters. The “window” length, w, specifies the length
of history used in calculating the new portfolio. To take advantage of short term
movements in the price relatives, a small window is used. Finally, $t_{corr} < 0$ is a correlation
threshold which determines if a stock is sufficiently anti-correlated with another stock
(in which case the weighting of the stock is changed).

A theoretical analysis of the delta algorithm seems to be beyond reach, at this stage. 
In Sections 7 and 8 we present experimental studies of the performance of the delta 
algorithm. 
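The informal description above leaves several details open, so the following Python sketch is only one possible reading of it, not the authors’ implementation. It assumes that a stock is “climbing” (respectively “falling”) when the product of its price relatives over the window is above (respectively below) one, that correlations are computed between the columns of the window, and that the at-risk amount is shared in proportion to the magnitude of the (negative) correlation. The function name `delta_step` and its argument layout are hypothetical.

```python
import numpy as np

def delta_step(weights, window, r, t_corr):
    """One rebalancing step of a DELTA-like rule (illustrative assumptions only).

    weights : current portfolio, length m, summing to one
    window  : array of shape (w, m) holding the last w price-relative vectors
    r       : risk parameter in [0, 1]
    t_corr  : negative correlation threshold
    """
    weights = np.asarray(weights, dtype=float)
    window = np.asarray(window, dtype=float)
    m = weights.size

    # Assumption: "climbing" / "falling" is judged by growth over the window.
    growth = window.prod(axis=0)

    # Pairwise correlation coefficients between stocks (columns of the window).
    with np.errstate(invalid="ignore", divide="ignore"):
        corr = np.corrcoef(window, rowvar=False)
    corr = np.nan_to_num(corr)  # constant columns would otherwise give NaN

    new_weights = weights.copy()
    for i in range(m):
        if growth[i] <= 1.0:
            continue  # only climbing stocks put weight at risk
        partners = [j for j in range(m)
                    if j != i and growth[j] < 1.0 and corr[i, j] < t_corr]
        if not partners:
            continue
        strength = np.array([-corr[i, j] for j in partners])
        at_risk = r * weights[i]
        new_weights[i] -= at_risk
        # Spread the at-risk amount over the falling, anti-correlated stocks.
        new_weights[partners] += at_risk * strength / strength.sum()
    return new_weights
```

The total weight is preserved by construction, since whatever a climbing stock gives up is redistributed in full.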

6.2 The Relation between Discrete Sequence Prediction and Portfolio Selection 

We briefly explore the well established relation between the portfolio selection problem 
and prediction of discrete sequences. We then discuss the use of some known prediction 
algorithms for the PS problem. 

Simply put, the standard worst case prediction game under the log-loss measure is 
a special case of the PS game where the adversary is limited to generating only Kelly 
market vectors. As mentioned in Section 5.1, Cover and Ordentlich showed that the PS 
algorithms uni and dir obtain their worst-case behavior over Kelly market sequences. 
However, this does not imply that the PS problem is reducible to the prediction problem, 
and we will see here several examples of prediction algorithms that are not competitive 
(against cbal-opt) in the PS context but are competitive in the prediction game. 

Here is a brief description of the prediction problem. (For a recent comprehensive 
survey of online prediction see [MF98].) In the online prediction problem the online 
player receives a sequence of observations $x_1, x_2, \ldots, x_{t-1}$ where the $x_i$ are symbols in 
some alphabet of size m. At each time instance t the player must generate a prediction $b_t$ 
for the next, yet unseen symbol $x_t$. The prediction is in general a probability distribution 
$b_t(x_t) = b_t(x_t \mid x_1, \ldots, x_{t-1})$ over the alphabet. Thus, $b_t$ gives the confidence that the 
player has in each of the m possible future outcomes. After receiving $x_t$ the player 
incurs a loss of $l(b_t, x_t)$ where l is some loss function. Here we concentrate on the 
log-loss function where $l(b_t, x_t) = -\log b_t(x_t)$. The total loss of a prediction algorithm 
$B = b_1, b_2, \ldots, b_n$ with respect to a sequence $X = x_1, x_2, \ldots, x_n$ is 
$L(B, X) = \sum_{t=1}^{n} l(b_t, x_t)$. 

⁵ The correlation coefficient is a normalized covariance with the covariance divided by the 
product of the standard deviations; that is, 

Cor(X,Y) = Cov(X,Y)/(std(X)*std(Y)) where 
Cov(X,Y) = E[(X-mean(X)) * (Y-mean(Y))]. 

As in the competitive analysis of online PS algorithms, it is customary to measure the 
worst case competitive performance of a prediction algorithm with respect to a compar- 
ison class of offline prediction strategies. Here we only consider the comparison class of 
constant predictors (which correspond to the class of constant rebalanced algorithms in 
the PS problem). The “competitive ratio”⁶ of a strategy B with respect to the comparison 
class $\mathcal{B}$ is $\max_X [L(B, X) - \inf_{b \in \mathcal{B}} L(b, X)]$. 

There is a complete equivalence between (competitive analysis of) online prediction 
algorithms under the log loss measure with respect to (offline) constant predictors and 
the (competitive analysis of) online PS algorithms with respect to (offline) constant 
rebalanced algorithms whenever the only allowable market vectors are Kelly sequences. 
To see that, consider the binary case m = 2 and consider the Kelly market sequence 
$X = x_1, \ldots, x_n$ where $x_i \in \{0, 1\}$ represents a Kelly market vector ($x_i = 0$ corresponds to 
the Kelly market vector $(1, 0)^T$ and $x_i = 1$ corresponds to $(0, 1)^T$). For ease of exposition we 
now consider (in this binary case) stock indices in {0, 1} (rather than {1, 2}). Let CBAL_b 
be the constant-rebalanced algorithm with portfolio (b, 1 − b). The return of CBAL_b is 
$R(b, X) = \prod_{i=1}^{n} ((1 - x_i) b + x_i (1 - b))$. Let $(n_0, n_1)$ be the type of X (i.e. in X there 
are $n_0$ zeros and $n_1$ ones, $n_0 + n_1 = n$). Taking the base 2 logarithm of the return and 
dividing by n we get 

$$
\begin{aligned}
\frac{1}{n} \log R(b, X) &= \frac{n_0}{n} \log b + \frac{n_1}{n} \log (1 - b) \\
&= -\frac{n_0}{n} \log \frac{1}{b} - \frac{n_1}{n} \log \frac{1}{1 - b} \\
&= -\left( \frac{n_0}{n} \log \frac{n_0/n}{b} + \frac{n_1}{n} \log \frac{n_1/n}{1 - b} \right)
   + \left( \frac{n_0}{n} \log \frac{n_0}{n} + \frac{n_1}{n} \log \frac{n_1}{n} \right) \\
&= -D_{KL}\left[ \left( \frac{n_0}{n}, \frac{n_1}{n} \right) \Big\| (b, 1 - b) \right]
   - H\left( \frac{n_0}{n}, \frac{n_1}{n} \right). \qquad (3)
\end{aligned}
$$



As the KL-divergence $D_{KL}(\cdot \| \cdot)$ is always non-negative, the optimal offline choice 
for the constant-rebalanced portfolio (cbal-opt), that maximizes $\frac{1}{n} \log R(b, X)$, is 
$b^* = n_0/n$, in which case the KL-divergence vanishes. 
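Identity (3) and the optimal choice $b^* = n_0/n$ are easy to check numerically. The short Python sketch below does so for one Kelly sequence, under the assumption that a bit 0 means the stock carrying weight b pays off and a bit 1 means the other stock does; the function names are illustrative only.

```python
import math

def log2_return(b, bits):
    """Base-2 log return of CBAL with portfolio (b, 1 - b) on a Kelly sequence.

    Assumption: bit 0 pays the stock with weight b, bit 1 the stock with weight 1 - b.
    """
    return sum(math.log2(b if x == 0 else 1.0 - b) for x in bits)

def entropy(p):
    """Entropy (in bits) of a probability vector."""
    return -sum(q * math.log2(q) for q in p if q > 0)

def kl_divergence(p, q):
    """KL divergence (in bits) between probability vectors p and q."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def check_identity(b, bits):
    """Return both sides of identity (3) for the given portfolio and sequence."""
    n = len(bits)
    n0 = bits.count(0)
    typ = (n0 / n, (n - n0) / n)
    lhs = log2_return(b, bits) / n
    rhs = -kl_divergence(typ, (b, 1.0 - b)) - entropy(typ)
    return lhs, rhs

bits = [0, 0, 1, 0, 1, 0, 0, 1]           # type (n0, n1) = (5, 3)
lhs, rhs = check_identity(0.4, bits)       # the two sides agree up to rounding
# Grid search for the best constant portfolio; the maximizer is n0 / n.
best_b = max((k / 1000 for k in range(1, 1000)),
             key=lambda b: log2_return(b, bits))
```

For this sequence the grid search returns 0.625, that is 5/8, matching $b^* = n_0/n$.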

We now consider the competitive ratio obtained by an algorithm ALG against 
CBAL-OPT. Using the above expression for the log-return of CBAL-OPT we get 

$$
\log \frac{\text{CBAL-OPT}}{\text{ALG}} = \log \text{CBAL-OPT} - \log \text{ALG}
= -nH\left( \frac{n_0}{n}, \frac{n_1}{n} \right) - \log \text{ALG}.
$$

⁶ The more common term in the literature is “regret”. 

Let $b_i$ denote the weight that ALG places on stock 0 in round i (equivalently, the 
probability it assigns to the outcome 0). Using Jensen’s inequality we have 

$$
\begin{aligned}
-\log \text{ALG}(X) &= -\log \prod_{i=1}^{n} \left( (1 - x_i) b_i + x_i (1 - b_i) \right) \\
&= -\sum_{i=1}^{n} \log \left( (1 - x_i) b_i + x_i (1 - b_i) \right) \\
&\le -\sum_{i=1}^{n} \left( (1 - x_i) \log b_i + x_i \log (1 - b_i) \right) \\
&= \sum_{i=1}^{n} \left( (1 - x_i) \log \frac{1 - x_i}{b_i (1 - x_i)} + x_i \log \frac{x_i}{(1 - b_i) x_i} \right) \\
&= \sum_{i=1}^{n} D_{KL}\left[ (1 - x_i, x_i) \| (b_i, 1 - b_i) \right] + \sum_{i=1}^{n} H(1 - x_i, x_i).
\end{aligned}
$$

Since each of the entropies $H(1 - x_i, x_i) = 0$ ($x_i = 0$ or $x_i = 1$), we have 

$$
-\log \text{ALG}(X) \le \sum_{i=1}^{n} D_{KL}\left[ (1 - x_i, x_i) \| (b_i, 1 - b_i) \right].
$$

Putting all this together, 

$$
\log \frac{\text{CBAL-OPT}}{\text{ALG}} \le \sum_{i=1}^{n} D_{KL}\left[ (1 - x_i, x_i) \| (b_i, 1 - b_i) \right]
- nH\left( \frac{n_0}{n}, \frac{n_1}{n} \right).
$$

In the prediction game the online player must generate a prediction $b_i$ for the i-th bit of 
a binary sequence that is revealed online. The first, KL-divergence, term of the bound 
measures in bits the total redundancy or inefficiency of the prediction. The second, 
entropy term, measures how predictable the sequence X is. In order to prove a competitive 
ratio of C against cbal-opt, it is sufficient to prove that 

$$
\sum_{i=1}^{n} D_{KL}\left[ (1 - x_i, x_i) \| (b_i, 1 - b_i) \right] \le \log C + nH\left( \frac{n_0}{n}, \frac{n_1}{n} \right). \qquad (4)
$$

If $x_i = 0$ the i-th KL-divergence term reduces to $\log \frac{1}{b_i}$ and if $x_i = 1$, it reduces to 
$\log \frac{1}{1 - b_i}$. Therefore, the summation over all these KL-divergence terms can be expressed 
as the logarithm of $\prod_i \frac{1}{z_i}$ where $z_i = b_i$ if $x_i = 0$ and $z_i = 1 - b_i$ if $x_i = 1$. We thus define, 
for an online prediction algorithm, its “probability product” P corresponding to an input 
sequence to be $\prod_i \frac{1}{z_i}$; to prove a competitive ratio of C it is sufficient to prove that 
$\log P = \log \prod_i \frac{1}{z_i} \le \log C + nH\left( \frac{n_0}{n}, \frac{n_1}{n} \right)$. A similar development to the above can 
be established for any m > 2. 



6.3 The Add-beta Prediction Rule 

Consider the following method for predicting the (t+1)st bit of a binary sequence, based 
on the frequency of zeros and ones that appeared until (and including) the t-th round. 
We maintain counts $C_j^t$, j = 0, 1, such that $C_0^t$ records the frequency of zeros and 
$C_1^t$ records the frequency of ones until and including round t. Based on these counts the 
(online) algorithm predicts that the (t+1)st bit will be 0 with probability 

$$
p^{t+1} = \frac{C_0^t + \beta}{2\beta + C_0^t + C_1^t},
$$

where the parameter β is a non-negative real. This rule is sometimes called the 
add-beta prediction rule. The instance β = 1 is called Laplace’s law of succession (see 
[MF98,CT91,Krich98]), and the case β = 1/2 is known to be optimal in both 
distributional and distribution-free (worst case) settings [MF98]. In the case where there 
are m possible outcomes, 1, ..., m, the add-beta rule becomes 

$$
p_i^{t+1} = \frac{C_i^t + \beta}{m\beta + \sum_{1 \le j \le m} C_j^t}.
$$

For the PS problem, one can use any prediction algorithm, such as the add-beta rule, in 
the following straightforward manner. Assume that the price relatives are normalized 
every day so that the largest price relative equals one. For each market vector 
$X = x_1, \ldots, x_m$ consider its Kelly projection K(X), in which all components are zero except 
for component $\arg\max_j x_j$, which is normalized to equal one.⁷ At each round such an 
algorithm yields a prediction for each of the m symbols. 
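The add-beta rule and the Kelly projection can be sketched in a few lines of Python. This is a minimal illustration rather than the code used for the experiments: the function names are hypothetical, ties in the projection are broken at random as in the footnote, and the uniform fallback for β = 0 with empty counts is an added assumption (the rule is undefined there).

```python
import random

def kelly_projection(x, rng=random):
    """Map a market vector to a Kelly vector: 1 at a maximal component, 0 elsewhere."""
    top = max(x)
    winners = [j for j, v in enumerate(x) if v == top]
    winner = rng.choice(winners)  # random tie-breaking among maximal stocks
    return [1 if j == winner else 0 for j in range(len(x))]

def add_beta(counts, beta):
    """Add-beta prediction: (C_i + beta) / (m * beta + sum of counts)."""
    m = len(counts)
    total = m * beta + sum(counts)
    if total == 0:
        # Assumption: fall back to the uniform prediction when the rule is undefined.
        return [1.0 / m] * m
    return [(c + beta) / total for c in counts]

def add_beta_portfolios(market, beta=0.5):
    """Use add-beta predictions on Kelly projections as portfolios, day by day."""
    m = len(market[0])
    counts = [0] * m
    portfolios = []
    for x in market:
        portfolios.append(add_beta(counts, beta))  # chosen before seeing x
        k = kelly_projection(x)
        counts = [c + kj for c, kj in zip(counts, k)]
    return portfolios
```

For instance, with counts (3, 1) and β = 1 the rule predicts (4/6, 2/6).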



Algorithm M0: The first prediction-based PS online algorithm we consider is called M0 
(for “Markov of order zero”). This algorithm simply uses the add-beta prediction rule 
on the Kelly projections of the market vectors. For the prediction game (equivalently, 
PS with Kelly markets) one can show that algorithm M0 with β = 1 is 
$(n + 1)^{m-1}$-competitive; that is, it achieves an identical competitive ratio to uni, the 
uniform-weighted PS algorithm of Cover and Ordentlich (see Section 5.1). Here is a sketch of 
the analysis for the case m = 2. The first observation is that the return of M0 is the same for all 
Kelly sequences of the same type. This result can be shown by induction on the length 
of the market sequence. It is now straightforward to calculate the “probability product” 
P(J) of M0 with respect to a market sequence X of type $J = (n_0, n_1)$ (by explicitly 
calculating it for a sequence which contains $n_0$ zeros followed by $n_1$ ones), which equals 
$(n + 1)\binom{n}{n_1}$. Since $\log \binom{n}{n_1} \le nH\left( \frac{n_0}{n}, \frac{n_1}{n} \right)$ (see [CT91]), we have 

$$
\begin{aligned}
\log P(J) &= \log \left( (n + 1) \binom{n}{n_1} \right) \\
&= \log (n + 1) + \log \binom{n}{n_1} \\
&\le \log (n + 1) + nH\left( \frac{n_0}{n}, \frac{n_1}{n} \right).
\end{aligned}
$$

⁷ To simplify the discussion, randomly choose amongst the stocks that achieve the maximum 
price relative if there is more than one such stock. 

Using inequality (4) and the development thereafter it follows that (with β = 1) algorithm 
M0 is (n + 1)-competitive in the prediction game (or restricted PS with Kelly sequences) 
and it is not hard to prove a tight lower bound of n + 1 on its competitive ratio using 
the Kelly sequence of all ones. One can quite easily generalize the above arguments for 
m > 2, and it is possible to prove using a similar (but more involved) analysis that M0 
based on the add-1/2 rule achieves a competitive ratio of $2(n + 1)^{(m-1)/2}$, the same as 
the universal dir algorithm for the general PS problem. (See Merhav and Feder [MF98] 
and the references therein.) 
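The closed form for the probability product under the add-1 (Laplace) rule can be checked with exact rational arithmetic. The Python sketch below takes the probability product to be the product of the reciprocals of the probabilities that the rule assigned to the bits actually observed; the helper names are illustrative.

```python
from fractions import Fraction
from math import comb, log2

def probability_product(bits, beta=1):
    """Product over rounds of 1 / (probability the add-beta rule gave the observed bit)."""
    beta = Fraction(beta)
    counts = [0, 0]
    product = Fraction(1)
    for x in bits:
        z = (counts[x] + beta) / (2 * beta + counts[0] + counts[1])
        product /= z
        counts[x] += 1
    return product

def binary_entropy(n0, n1):
    """Entropy (in bits) of the type (n0 / n, n1 / n)."""
    n = n0 + n1
    return -sum((k / n) * log2(k / n) for k in (n0, n1) if k > 0)

bits = [0, 1, 1, 0, 1]                      # type (n0, n1) = (2, 3), n = 5
p = probability_product(bits)               # equals (n + 1) * C(n, n1) = 6 * 10
bound = log2(len(bits) + 1) + len(bits) * binary_entropy(2, 3)
all_ones = probability_product([1, 1, 1, 1])  # equals n + 1 = 5, the tight case
```

Reordering the bits leaves `p` unchanged, which illustrates the observation that the return depends only on the type of the sequence.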

Despite the fact that algorithm M0 is competitive in the online prediction game it is 
not competitive (nor even universal) in the unrestricted PS game against offline constant 
rebalanced portfolios. For the case m = 2 this can be shown using market sequences of 
the form 




It is easy to see that the competitive ratio is Cover and Gluss [CG86] show how 

a less naive learning algorithm is universal under the assumption that the set of possible 
market vectors is bounded. In doing so, they illustrate how their algorithm (based on the 
Blackwell [B156] approachability-excludability theorem) avoids the potential pitfalls of 
a naive counting scheme such as M0. 



Algorithm T0: One might surmise that the deficiency of algorithm M0 (in the PS game) 
can be attributed to the fact that it ignores useful information in market vectors. Like M0, 
algorithm T0 uses the add-beta rule but now maintains its counters as follows: 

$$
C_j^{t+1} = C_j^t + \log_2 (x_{t+1,j} + 1).
$$

Here we again assume that price relatives have been normalized so that on any given 
day the maximum is one. Clearly, T0 reduces to M0 in the prediction game. Algorithm T0 
is also not competitive. This can be shown using sequences of the form 




For sufficiently large t, it is clear that the behavior of T0 on the sequence will be similar 
to that of CBAL(1/2, 1/2), which is not competitive. 
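The counter update used by this algorithm is small enough to sketch directly. The Python fragment below is an illustration under the stated normalization (each day’s price relatives are divided by the day’s maximum); the function name is hypothetical.

```python
import math

def t0_update(counts, x):
    """Add log2(normalized price relative + 1) to each stock's counter."""
    top = max(x)  # normalize so that the largest price relative equals one
    return [c + math.log2(v / top + 1.0) for c, v in zip(counts, x)]

# On a Kelly vector the update adds exactly one to the winner and zero to the rest,
# which is the plain counting used on Kelly projections.
kelly_counts = t0_update([0.0, 0.0], [0.0, 2.0])
# On a general vector every stock with a positive price relative gains some count.
mixed_counts = t0_update([0.0, 0.0], [0.5, 1.0])
```

The resulting counters are then fed to the add-beta rule in place of the integer counts.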

6.4 Prediction-Based Algorithms: Lempel-Ziv Trading 

One can then suspect that the non-competitiveness of the add-beta variants (M0, T0) is 
due to the fact that they ignore dependencies among market vectors (they record only 
zero-order statistics). In an attempt to examine this possibility we also consider the 
following trading algorithm based on the Lempel-Ziv compression algorithm [ZL78]. 
The LZ algorithm was also considered in the context of prediction (see Langdon [Lan83] 
and Rissanen [Ris83]). Feder [Fed91] and Feder and Gutman [FG92] consider the worst 
case competitive performance of the algorithm in the context of gambling. Feder shows 
that the LZ algorithm is universal with respect to the (offline) class of all finite state 
prediction machines. 

Like M0, the PS algorithm based on Lempel-Ziv is the LZ prediction algorithm applied 
to the Kelly projections of the market vectors. Using the same nemesis sequence as for 
M0, namely 




it is easy to see that a lower bound for the competitive ratio of lz is and hence 

LZ is not competitive by our definition (although it might still be universal). 



6.5 Portfolio Selection Work Function Algorithm 

In his PhD thesis Ordentlich suggests the following algorithm (which is also a 
generalization of the M0 algorithm with β = 1/2). For the case m = 2 this algorithm chooses 
the next portfolio $b_{t+1}$ to be 



bt+i = 



t-l. 



l/2\ 

1 / 2 )' 



where $b_t^*$ is the optimal constant rebalanced portfolio until time t. With its two 
components, one that tracks the optimal offline algorithm so far and a second which may be 
viewed as greedy, this algorithm can be viewed as a kind of work function algorithm 
(see [BE98]). 

Ordentlich shows that the sequence 




produces a competitive ratio of from this algorithm. 

We can slightly improve Ordentlich’s lower bound to I7(n®/^). We concatenate 

(Cj*))* ((p))* for i = 2 . . . fc to the end of Ordentlich’s sequence, with k = 6>(i/(n) so 
that the entire input sequence remains of length 0{n). 

It remains an open question as to whether or not Ordentlich’s algorithm is competitive 
(or at least universal). The lower bound shows that this algorithm is not as 
competitive as uni and dir. 



7 Experimental Results 

We consider three data sets as test suites for most of the algorithms considered in the 
relevant literature.⁸ The first data set is the stock market data as first used by Cover 
[Cov91] and then Cover and Ordentlich [CO96], Helmbold et al. [HSSW98], Blum and 
Kalai [BK97] and Singer [Sin98].⁹ This data set contains 5651 daily prices for 36 stocks 
in the New York Stock Exchange (NYSE) for the twenty-two year period July 3rd, 1962 
to Dec 31st, 1984. The second data set consists of 88 stocks from the Toronto Stock 
Exchange (TSE), for the five year period Jan 4th, 1994 to Dec 31st, 1998. The stocks 
chosen were those that were traded on each of the 1258 trading days in this period. The 
final data set is for intra-day trading in the foreign exchange (FX) market. Specifically, 
the data covers the bid-ask quotes between USD ($) and Japanese Yen, and between 
USD and German Marks (DM) for the one year period Oct 1st, 1992 to Sep 30th, 1993. 
As explained in Section 7.2, we interpret this data as 479081 price relatives for m = 2 
stocks (i.e. Yen and DM). 

⁸ We do not present experiments for the dir algorithm nor for Ordentlich’s “work function 
algorithm”. Even though dir’s worst case competitive bound is better than that of uni, it has been 
found in practice (see [HSSW98]) that dir’s performance is worse than that of uni. It is 
computationally time consuming to even approximate dir and the work function algorithm. In the 
full version of this paper we plan to present experimental results for these algorithms. 



7.1 Experiments on NYSE data 

All 36 stocks in the sample had a positive return over the entire 22 year sample. The 
returns ranged from a low of 3.1 to a high (bah-opt) of 54. Before running the online 
algorithms on pairs of stocks, we determined the cbal-opt of each stock when traded 
against cash and a 4% bond as shown in Table 2. 

Of the 36 stocks, only three benefited from an active trading strategy against cash. The 
winning strategy for the remaining 33 stocks was to buy and hold the stock. When cash 
was replaced by a 4% annualized bond, seven stocks (the ones highlighted in Table 2) 
benefited from active trading. It is interesting to note that the cbal-opt of all 36 stocks 
is comprised of a portfolio of just 5 stocks. This portfolio has a return of 251. 



Stock          Weight 
Comm Metals    0.2767 
Espey          0.1953 
Iroquois       0.0927 
Kin Ark        0.2507 
Mei Corp       0.1845 



Four of these five stocks are also the ones that most benefited from active trading by 
CBAL-OPT against a bond. The cbal-opt of the remaining 31 stocks is still a respectable 
69.9, beating bah-opt for the entire 36 stocks. 



Pairs of Stocks When looking at individual pairs of stocks, however, one finds a different 
story. Instead of just these five or seven stocks benefiting from being paired with one 
another, one finds that of the possible 630 pairs, almost half (44%) have a cbal-opt that 
exceeds the return of the best stock by 10% or more. It is this fact that has encouraged the 
consideration of competitive-based online algorithms as this sample does indicate that 
many stock pairings can benefit from frequent trading. Of course, it can be argued that 
identifying a “profitable pair” is the real problem. 

⁹ According to Helmbold et al., this data set was originally generated by Hal Stern. We do not 
know what criteria were used in choosing this particular set of 36 stocks. 
Stock            BAH      BND(4%)   CBAL-OPT(0%)   CBAL-OPT(4%) 
Dupont *         3.07     2.41      3.07           3.18 
Kin Ark *        4.13     2.41      12.54          18.33 
Sears            4.25     2.41      4.25           4.25 
Lukens *         4.31     2.41      4.31           4.93 
Alcoa *          4.35     2.41      4.35           4.42 
Ingersoll        4.81     2.41      4.81           4.81 
Texaco           5.39     2.41      5.39           5.39 
MMM              5.98     2.41      5.98           5.98 
Kodak            6.21     2.41      6.21           6.21 
SherWill         6.54     2.41      6.54           6.54 
GM               6.75     2.41      6.75           6.75 
Ford             6.85     2.41      6.85           6.85 
P and G          6.98     2.41      6.98           6.98 
Pillsbury        7.64     2.41      7.64           7.64 
GE               7.86     2.41      7.86           7.86 
Dow Chem         8.76     2.41      8.76           8.76 
Iroquois *       8.92     2.41      9.81           12.08 
Kimb Clark       10.36    2.41      10.36          10.36 
Fischbach        10.70    2.41      10.70          10.70 
IBM              12.21    2.41      12.21          12.21 
AHP              13.10    2.41      13.10          13.10 
Coke             13.36    2.41      13.36          13.36 
Espey *          13.71    2.41      14.88          17.89 
Exxon            14.16    2.41      14.16          14.16 
Merck            14.43    2.41      14.43          14.43 
Mobil            15.21    2.41      15.21          15.21 
Amer Brands      16.10    2.41      16.10          16.10 
Pillsbury        16.20    2.41      16.20          16.20 
Arco             16.90    2.41      16.90          16.90 
JNJ              17.22    2.41      17.22          17.22 
Mei Corp *       22.92    2.41      22.92          23.29 
HP               30.61    2.41      30.61          30.61 
Gulf             32.65    2.41      32.65          32.65 
Schlum           43.13    2.41      43.13          43.13 
Comm Metals      52.02    2.41      52.02          52.02 
Morris           54.14    2.41      54.14          54.14 

Table 2. Return of CBAL-OPT when traded against 0% cash and a 4% bond. For example, 
when Dupont is balanced against cash (respectively, the bond), the return of CBAL-OPT = 3.07 
(respectively, 3.18). The seven stocks that profit from active trading against a bond have been 
highlighted (marked with *). 
We abbreviate this uniform cbal algorithm as UCBAL_m. The uniform buy and hold 
(denoted UBAH_m) and UCBAL_m algorithms give us reasonable (and perhaps more realistic) 
benchmarks by which to compare online algorithms; that is, while one would certainly 
expect a good online algorithm to perform well relative to the uniform bah, it also seems 
reasonable to expect good performance relative to ucbal since both algorithms can be 
considered as naive strategies. We found that for over half (51%) of the stock pairings, 
CBAL-OPT has a 10% or more advantage over the (1/2, 1/2) cbal. 

The finding that eg (with small learning rate) has no substantial (say, 1%) advantage 
over UCBAL_2 confirms the comments made in Section 5.2. Previous expositions 
demonstrated impressive returns; that is, cases where eg outperformed uni and 
significantly outperformed the best stock (in a pair of stocks). The same can now be said for 
UCBAL_2. The other interesting result is that delta seems to do remarkably better than the 
UCBAL_2 algorithm; it is at least 10% better than UCBAL_2 for 344 pairs. In fact, delta does 
at least 10% better than cbal-opt a third of the time (204 pairs). In the full version of this 
paper we will present several other algorithms, some that expand and some that limit 
the risk. 



All Stocks Some of the algorithms were exercised on a portfolio of all 36 stocks. In 
order to get the most from the data and detect possible biases, the sample was split up 
into 10 equal time periods. These algorithms were then run on each of the segments. In 
addition, the algorithms were run on the reverse sequence to see how they would perform 
in a falling market. These results will be presented in the full paper. 



7.2 Experiments on the TSE and FX data 

The TSE and FX data are quite different in nature from the NYSE data. In particular, 
while every stock made money in the NYSE data, 32 of the 88 stocks in the TSE data lost 
money. The best return was 6.28 (Centra Inc.) and the worst return was .117 (Pure Gold 
Minerals). There were 15 stocks having non zero weight in cbal-opt, with three stocks 
(including Centra Inc.) constituting over 80% of the weight. Unlike its performance on 
the NYSE data, with respect to the TSE data, uni does outperform the online algorithms 
UCBAL, EG, MO and TO. It does not, however, beat delta and it is somewhat worse than 
UBAH. The FX data was provided to us in a very different form, namely as bid-ask quotes 
(as they occur) as opposed to (say) closing daily prices. We interpreted each “tick” as 
a price by taking the average value (i.e. (ask+bid)/2). Since each tick only represents 
one currency, we merged the ticks into blocks where each block is the shortest number 
of ticks for which each currency is traded. We then either ignored the bid-ask nature 
of the prices or we used this information to derive an induced (and seemingly realistic) 
transaction cost for trading a given currency at any point in time. The Yen decreased 
with a return of 0.8841 while the DM increased with a return of 1.1568. It should also 
be noted that the difference in prices for consecutive ticks is usually quite small and 
thus frequent trading in the context of even small transaction costs (i.e. spreads) can be 
a very poor strategy. Note that for this particular FX data, cbal-opt is simply bah-opt. 

Table 3 reports on the returns of the various algorithms for all three data sets with 
and without transaction costs. Without transaction costs, we see that simple learning 
algorithms such as M0 can sometimes do quite well, while more aggressive strategies 
such as DELTA and LZ can have fantastic returns. 



8 Portfolio Selection with Commissions and Bid-ask Spreads 



Algorithmic portfolio selection with commissions is not so well studied. There can be 
many commission models. Two simple models are the flat fee model and the proportional 
(or fixed rate) model. In some markets, such as foreign exchange, there are no explicit 
commissions at all (for large volume trades) but a similar (and more complicated) effect 
is obtained due to buy-sell or bid-ask spreads. In the current reality, with the emerging 
Internet trading services, the effect of commissions on traders is becoming less significant 
but bid-ask spreads remain. The data for the FX market contains the bid-ask spreads. 
When we want to view bid-ask spreads as transaction costs we define the transaction rate 
as (ask-bid)/(ask+bid). The resulting (induced) transaction costs are quite non-uniform 
(over time), ranging between .00015 and .0094 with a mean transaction rate of .00057. 
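The preprocessing of the FX quotes described in Section 7.2 and the induced transaction rate defined above can be sketched as follows. This is an illustrative Python reading, not the authors’ code: the tick layout `(currency, bid, ask)`, the labels "JPY" and "DEM", and the choice to keep the last quote of each currency within a block are all assumptions.

```python
def mid_price(bid, ask):
    """Interpret a tick as a single price: the average of bid and ask."""
    return (ask + bid) / 2

def transaction_rate(bid, ask):
    """Induced transaction rate of a quote: (ask - bid) / (ask + bid)."""
    return (ask - bid) / (ask + bid)

def merge_ticks(ticks, currencies=("JPY", "DEM")):
    """Group ticks into the shortest blocks in which every currency is quoted.

    Assumption: within a block, the most recent quote of each currency is kept.
    """
    blocks, latest = [], {}
    for currency, bid, ask in ticks:
        latest[currency] = (bid, ask)
        if all(c in latest for c in currencies):
            blocks.append(dict(latest))
            latest = {}
    return blocks

def price_relatives(blocks, currencies=("JPY", "DEM")):
    """Price relatives between consecutive blocks, based on mid prices."""
    mids = [[mid_price(*block[c]) for c in currencies] for block in blocks]
    return [[cur / prev for prev, cur in zip(p, q)] for p, q in zip(mids, mids[1:])]

ticks = [("JPY", 99, 101), ("JPY", 100, 102), ("DEM", 49, 51),
         ("DEM", 50, 52), ("JPY", 101, 103)]
blocks = merge_ticks(ticks)          # two blocks, each quoting both currencies
relatives = price_relatives(blocks)  # one price-relative vector
```

Note that the transaction rate equals half the spread divided by the mid price, so a quote of 99/101 induces a rate of 1%.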

Table 3 presents the returns for the various algorithms for all data sets using different 
transaction costs. For the NYSE and TSE data sets, we used fixed transaction cost rates 
of .1% (i.e. very small) and 2% (more or less “full service”). For the FX data we either 
artificially introduced fixed rate costs or used the actual bid-ask spreads as discussed 
above. 

For the simple but important model of fixed rate transaction costs, it is not too difficult 
to extend the competitive analysis results to reflect such costs. In particular, Blum and 
Kalai [BK97] extend the proof of uni’s competitiveness to this model of transaction 
costs. Suppose then that there is a transaction rate cost of c (0 < c < 1); that is, to buy 
(or sell) $d of a stock costs $(c/2)d or alternatively, we can say that all transaction costs 
are paid for by selling at a commission rate of c. Blum and Kalai prove that uni has a 
competitive ratio upper bounded by generalizing the bound in Equation 1 

in Section 5.1. Using their proof in Section 5.1, one can obtain a bound of 



CBALb* (X) 
CBALb(X) 




(l+c)n 






whenever b is near b* so that the competitive ratio is bounded above by g(H-c) . _|_ 

Blum and Kalai [BK97], and then Helmbold et al. [HSSW98] and Singer [Sin98],¹⁰ 
present a few experimental results which seem to indicate that although transaction costs 
are significant, it is still possible to obtain reasonable returns from algorithms such as 
UNI and EG. Indeed for the NYSE data, eg “beats the market” even in the presence of 2% 
transaction costs. Our experiments seem to indicate that transaction costs may be much 
more problematic than some of the previous results and the theoretical competitiveness 
(say of uni) suggest. Algorithms such as LZ and our delta algorithm can sometimes have 
exceptionally good returns when there are no transaction costs but disastrous returns with 
(not unreasonable) costs of 2%. 

¹⁰ We have been recently informed by Yoram Singer that the experimental results for his adaptive 
γ algorithm are not correct and, in particular, the results with transaction costs are not as 
encouraging as reported in [Sin98]. 
               UBAH       UCBAL      DELTA(1,10,-.01)   DELTA(.1,4,-.01)   M0(.5) 
NYSE (0%)      14.4973    27.0752    1.9 X 10“®         326.585            111.849 
NYSE (.1%)     14.4901    26.1897    1.7 X 10°'^        246.224            105.975 
NYSE (2%)      14.3523    13.9218    1.6 X 10“^®        1.15638            38.0202 
TSE (0%)       1.61292    1.59523    4.99295            1.93648            1.27574 
TSE (.1%)      1.61211    1.58026    2.91037            1.82271            1.2579 
TSE (2%)       1.59679    1.32103    9.9 X 10"°®        .576564            .962617 
FX (0%)        1.02047    1.0225     22094.4            3.88986            1.01852 
FX (.1%)       1.01996    .984083    1.6 X 10“^®        5.7 X 10"°®        .979137 
FX (2%)        1.01026    .475361    1.1 X 10"®^^       8.2 X 10“®'^       .462863 
FX (bid-ask)   1.02016    .999421    1.02 X 10"^®       .00653223          .994662 

               T0(.5)     EG(.01)    LZ                 UNI                CBAL-OPT 
NYSE (0%)      27.0614    27.0869    79.7863            13.8663            250.592 
NYSE (.1%)     26.1773    26.2012    5.49837            13.8176            NC 
NYSE (2%)      13.9252    14.6023    3.5 X 10"^^        9.90825            NC 
TSE (0%)       1.59493    1.59164    1.32456            1.60067            6.43390 
TSE (.1%)      1.58002    1.58006    .597513            1.58255            NC 
TSE (2%)       1.32181    1.34234    1.5 X 10“°'^       1.41695            NC 
FX (0%)        1.0225     1.0223     716.781            1.02181            1.15682 
FX (.1%)       .984083    .984435    1.9 X 10“®^        .996199            1.15624 
FX (2%)        .475369    .51265     1.6 X 10“®^^       .631336            1.14525 
FX (bid-ask)   .999422    .999627    1.04 X 10"!'^      1.00645            1.15661 

Table 3. The returns of various algorithms for three different data sets using different transaction 
costs. Note that uni and CBAL-OPT have only been approximated. The notation NC indicates an 
entry which has not yet been calculated. 
9 Concluding Remarks and Future Work 

From both a theoretical and experimental point of view, it is clear that a competitive based 
approach to portfolio selection is only just beginning to emerge. In contrast, the related 
topic of sequence prediction is much better developed and indeed the practice seems 
to closely follow the theory. There are, of course, at least two significant differences; 
namely that (first) sequence prediction is a special case of portfolio selection and (second) 
that transaction costs (or alternatively, bid-ask spreads) are a reality having a significant 
impact. In the terminology of metrical task systems and competitive analysis (see [BE98] 
and [BB97]), there is a cost to change states (i.e. portfolios). 

On the other hand, PS algorithms can be applied to the area of expert prediction. 
Specifically, we view each expert as a stock whose log loss for a given prediction can be 
exponentiated to generate a stock price. Applying a PS algorithm to these prices yields 
a portfolio which can be interpreted as a mixture of experts. (See Ordentlich [Or96] and 
Kalai, Chen, Blum and Rosenfeld [KCBR99].) 
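The reduction from expert prediction to portfolio selection can be made concrete with a small Python sketch. It assumes the natural-log loss, so that exponentiating the negated loss gives back the probability the expert assigned to the realized outcome, and it uses uniform buy and hold as the simplest possible PS algorithm; both function names are illustrative and are not taken from [Or96] or [KCBR99].

```python
import math

def expert_price_relatives(predictions, outcome):
    """Turn each expert's log loss on the realized outcome into a price relative.

    predictions : list of probability vectors, one per expert
    outcome     : index of the symbol that actually occurred (must have p > 0)
    """
    losses = [-math.log(p[outcome]) for p in predictions]
    return [math.exp(-loss) for loss in losses]

def buy_and_hold_mixture(relatives_history):
    """Current portfolio of uniform buy and hold, read as mixture weights over experts."""
    k = len(relatives_history[0])
    wealth = [1.0 / k] * k
    for x in relatives_history:
        wealth = [w * xi for w, xi in zip(wealth, x)]
    total = sum(wealth)
    return [w / total for w in wealth]

# Two experts predict a binary symbol; the outcome 0 is then observed.
relatives = expert_price_relatives([[0.9, 0.1], [0.5, 0.5]], outcome=0)
weights = buy_and_hold_mixture([relatives])
```

Under these assumptions the buy-and-hold weights coincide with the posterior weights of a Bayesian mixture with a uniform prior; other PS algorithms would yield other mixtures.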

Any useful online algorithm must at least “beat the market”; that is, the algorithm 
should be able to consistently equal and many times surpass the performance of the 
uniform buy and hold. In the absence of transaction costs, all of the online algorithms 
discussed in this paper were able to beat the market for the NYSE stock data. However, the 
same was not true for the TSE data, nor for the currency data. Furthermore, when even 
modest transaction costs (e.g. .1%) were introduced many of the algorithms suffered 
significant (and sometimes catastrophic) losses. This phenomenon is most striking for 
the DELTA algorithm and for the lz algorithm which is a very practical algorithm in the 
prediction setting. 

Clearly the most obvious direction for future research is to understand the extent 
to which a competitive based theory of online algorithms can predict performance with 
regard to real stock market data. And here we are only talking about a theory that 
completely disregards feedback on the market of any successful algorithm. We conclude 
with a few questions of theoretical interest. 

1. What is the competitive ratio for Ordentlich’s “work function algorithm” and for the 
Lempel-Ziv PS algorithm? 

2. Can an online algorithm that only considers the Kelly projection of the market 
input sequence be competitive (or universal)? What other general classes of online 
algorithms can be analyzed? 

3. How can we define a portfolio selection “learning algorithm”? Is there a “true 
learning” PS algorithm that can attain the worst case competitive bounds of uni or dir? 

4. To what extent can PS algorithms utilize “side information”, as defined in 
Cover and Ordentlich [CO96]? See the very promising results in Helmbold 
et al. [HSSW98]. 

5. Determine the optimal competitive ratio against cbal-opt and against opt in the 
context of a fixed commission rate c. 

6. Develop competitive bounds within the context of bid-ask spreads. 

7. Continue the study of portfolio selection algorithms in the context of “short- 
selling”. (See Vovk and Watkins [VW98].) 

8. Consider other benchmark algorithms as the basis of a competitive theory. 
Abstract. Andreev et al. [3] gave constructions of Boolean functions 
(computable by polynomial-size circuits) with large lower bounds for 
read-once branching programs (1-b.p.’s): a function in P with the lower 
bound a function in quasipolynomial time with the lower 
bound and a function in LINSPACE with the lower bound 
$2^{n - \log n - O(1)}$. We point out alternative, much simpler constructions of 
such Boolean functions by applying the idea of almost k-wise independence 
more directly, without the use of discrepancy set generators for 
large affine subspaces; our constructions are obtained by derandomizing 
the probabilistic proofs of existence of the corresponding combinatorial 
objects. The simplicity of our new constructions also allows us to observe 
that there exists a Boolean function in AC^0[2] (computable by a depth 3, 
polynomial-size circuit over the basis {∧, ⊕, 1}) with the optimal lower 
bound $2^{n - \log n - O(1)}$ for 1-b.p.’s. 



1 Introduction 

Branching programs represent a model of computation that measures the space 
complexity of Turing machines. Recall that a branching program is a directed 
acyclic graph with one source and with each node of out-degree at most 2. Each 
node of out-degree 2 (a branching node) is labeled by an index of an input bit, 
with one outgoing edge labeled by 0, and the other by 1; each node of out-degree 
0 (a sink) is labeled by 0 or 1. The branching program accepts an input if there 
is a path from the source to a sink labeled by 1 such that, at each branching 
node of the path, the path contains the edge labeled by the input bit for the 
input index associated with that node. Finally, the size of a branching program 
is defined as the number of its nodes. 
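The definition above can be made concrete with a small sketch. The dictionary encoding, the node names, and the helpers `evaluate`, `size`, and `is_read_once` are assumptions made for illustration and are not taken from the paper; only branching nodes and sinks are modelled, and the read-once check simply walks every path.

```python
from itertools import product

# Hypothetical encoding: a branching node is a tuple
# (input_index, child_if_0, child_if_1); a sink is the integer 0 or 1.
PARITY3 = {
    "s": (0, "a0", "a1"),
    "a0": (1, "b0", "b1"),
    "a1": (1, "b1", "b0"),
    "b0": (2, "zero", "one"),
    "b1": (2, "one", "zero"),
    "zero": 0,
    "one": 1,
}


def evaluate(bp, source, x):
    """Follow the path selected by input x and return the label of the sink reached."""
    node = source
    while isinstance(bp[node], tuple):
        index, child0, child1 = bp[node]
        node = child1 if x[index] else child0
    return bp[node]


def size(bp):
    """Size of a branching program: the number of its nodes."""
    return len(bp)


def is_read_once(bp, source):
    """True if no input index labels two branching nodes on any source-to-sink path."""
    def walk(node, seen):
        if not isinstance(bp[node], tuple):
            return True
        index, child0, child1 = bp[node]
        if index in seen:
            return False
        return walk(child0, seen | {index}) and walk(child1, seen | {index})

    return walk(source, frozenset())


if __name__ == "__main__":
    for x in product((0, 1), repeat=3):
        print(x, evaluate(PARITY3, "s", x))
    print(size(PARITY3), is_read_once(PARITY3, "s"))
```

The example program computes the parity of three bits with seven nodes and queries each variable at most once on every path.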

While there are no nontrivial lower bounds on the size of general branching 
programs, strong lower bounds were obtained for a number of explicit Boolean 
functions in restricted models (see, e.g., [12] for a survey). In particular, for 
read-once branching programs (1-b.p.'s), where, on every path from the source to a 
sink, no two branching nodes are labeled by the same input index, exponential 
lower bounds of the form $2^{\Omega(\sqrt{n})}$ were given for explicit $n$-variable Boolean 
functions in [17,18,5,7,8,16,10,6,4], among others. Moreover, [7,8,6,4] exhibited 
Boolean functions in $AC^0$ that require 1-b.p.'s of size at least $2^{\Omega(\sqrt{n})}$. 

After lower bounds of the form $2^{\Omega(\sqrt{n})}$ were obtained for 1-b.p.'s, the natural 
problem was to find an explicit Boolean function with the truly exponential lower 
bound $2^{\Omega(n)}$. The first such bound was proved in [1] for the Boolean function 
computing the parity of the number of triangles in a graph; the constant factor 
was later improved in [16]. With the objective to improve this lower bound, Savicky 
and Zak [15] constructed a Boolean function in P that requires a 1-b.p. of 
size at least $2^{n-3\sqrt{n}}$, and gave a probabilistic construction of a Boolean function 
requiring a 1-b.p. of size at least $2^{n-O(\log n)}$. Finally, Andreev et al. [3] 
presented a Boolean function in LINSPACE $\cap$ P/poly with the optimal lower bound 
$2^{n-\log n+O(1)}$, and, by derandomizing the probabilistic construction in [15], a 
Boolean function in QP $\cap$ P/poly with the lower bound $2^{n-O(\log n)}$, as well as 
a Boolean function in P with the lower bound $2^{n-O(\log^2 n)}$; here QP stands for 
quasipolynomial time, $n^{\mathrm{polylog}(n)}$. 

The combinatorics of 1-b.p.'s is quite well understood: a theorem of Simon 
and Szegedy [16], generalizing the ideas of many papers on the subject, provides 
a way of obtaining strong lower bounds. A particular case of this theorem states 
that any 1-b.p. computing an $r$-mixed Boolean function has size at least $2^r - 1$. 
Informally, an $r$-mixed function essentially depends on every set of $r$ variables 
(see the next section for a precise definition). The reason why this lower-bound 
criterion works can be summarized as follows. A subprogram of a 1-b.p. $G_n$ 
starting at a node $v$ does not depend on any variable queried along any path going 
from the source $s$ of $G_n$ to $v$, and hence $v$ completely determines a subfunction of 
the function computed by $G_n$. If $G_n$ computes an $r$-mixed Boolean function $f_n$, 
then any two paths going from $s$ to $v$ can be shown to query the same variables, 
whenever $v$ is sufficiently close to $s$. Hence, such paths must coincide, i.e., assign 
the same values to the queried variables; otherwise, two different assignments to 
a set of at most $r$ variables yield the same subfunction of $f_n$, contradicting the 
fact that $f_n$ is $r$-mixed. It follows that, near the source, $G_n$ is a complete binary 
tree, and so it must have exponentially many nodes. 

Andreev et al. [3] construct a Boolean function $f_n(x_1, \ldots, x_n)$ in LINSPACE $\cap$ 
P/poly that is $r$-mixed for $r = n - \lceil\log n\rceil - 2$ for almost all $n$. By the lower-bound 
criterion mentioned above, this yields the optimal lower bound $\Omega(2^n/n)$ 
for 1-b.p.'s. A Boolean function in DTIME($2^{\log^2 n}$) $\cap$ P/poly that requires a 1-b.p. 
of size at least $2^{n-O(\log n)}$ is constructed by reducing the amount of randomness 
used in the probabilistic construction of [15] to $O(\log^2 n)$ advice bits. Since these 
bits turn out to determine a polynomial-time computable function with the lower 
bound $2^{n-O(\log n)}$, one gets a function in P with the lower bound $2^{n-O(\log^2 n)}$ by 
making the advice bits a part of the input. 

Both constructions in [3] use the idea of $\epsilon$-biased sample spaces introduced 
by Naor and Naor [9], who also gave an algorithm for generating small sample 
spaces; three simpler constructions of such spaces were later given by Alon et 
al. [2]. Andreev et al. define certain $\epsilon$-discrepancy sets for systems of linear 
equations over GF(2), and relate these discrepancy sets to the biased sample spaces 
of Naor and Naor through a reduction lemma. Using a particular construction 
of a biased sample space (the powering construction from [2]), Andreev et al. 
give an algorithm for generating $\epsilon$-discrepancy sets, which is then used to 
derandomize both a probabilistic construction of an $r$-mixed Boolean function for 
$r = n - \lceil\log n\rceil - 2$ and the construction in [15] mentioned above. 

Our results. We will show that the known algorithms for generating small 
$\epsilon$-biased sample spaces can be applied directly to get the $r$-mixed Boolean function 
as above, and to derandomize the construction in [15]. The idea of our first 
construction is very simple: treat the elements (bit strings) of an $\epsilon$-biased sample 
space as the truth tables of Boolean functions. This will induce a probability 
distribution on Boolean functions such that, on any subset $A$ of $k$ inputs, the 
restriction to $A$ of a Boolean function chosen according to this distribution will 
look almost as if it were a uniformly chosen random function defined on the set 
$A$. By an easy probabilistic argument, we will show that such a space of functions 
will contain the desired $r$-mixed function, for a suitable choice of parameters $\epsilon$ 
and $k$. 

We indicate several ways of obtaining an $r$-mixed Boolean function with 
$r = n - \lceil\log n\rceil - 2$. In particular, using Razborov's construction of $\epsilon$-biased sample 
spaces that are computable by $AC^0[2]$ formulas [11] (see also [13]), we prove 
that there are such $r$-mixed functions that belong to the class of polynomial-size 
depth 3 formulas over the basis $\{\&, \oplus, 1\}$. This yields the smallest (nonuniform) 
complexity class known to contain Boolean functions with the optimal lower 
bounds for 1-b.p.'s. (We remark that, given our lack of strong circuit lower 
bounds, it is conceivable that the characteristic function of every language in 
EXP can be computed in nonuniform $AC^0[6]$.) 

In our second construction, we derandomize a probabilistic existence proof 
in [15]. We proceed along the usual path of derandomizing probabilistic 
algorithms whose analysis depends only on almost $k$-wise independence rather than 
full independence of random bits [9]. Observing that the construction in [15] 
is one such algorithm, we reduce its randomness complexity to $O(\log^3 n)$ bits 
(again treating strings of an appropriate sample space as truth tables). This gives 
us a DTIME($2^{O(\log^3 n)}$)-computable Boolean function of quasilinear circuit-size 
with the lower bound for 1-b.p.'s slightly better than that for the corresponding 
quasipolynomial-time computable function in [3], and a Boolean function in 
quasilinear time, QL, with the lower bound for 1-b.p.'s at least $2^{n-O(\log^3 n)}$, which 
is only slightly worse than the lower bound for the corresponding polynomial-time 
function in [3]. In the analysis of our construction, we employ a combinatorial 
lemma due to Razborov [11], which bounds from above the probability that 
none of $n$ events occur, given that these events are almost $k$-wise independent. 

The remainder of the paper. In the following section, we state the necessary 
definitions and some auxiliary lemmas. In Section 3, we show how to construct an 
$r$-mixed function that has the same optimal lower bound for 1-b.p.'s as that in [3], 
and observe that such a function can be computed in $AC^0[2]$. In Section 4, we 
give a simple derandomization procedure for a construction in [15], obtaining two 
more Boolean functions (computable in polynomial time and quasipolynomial 
time, respectively) that are hard with respect to 1-b.p.'s. 

2 Preliminaries 

Below we recall the standard definitions of $k$-wise independence and 
$(\epsilon, k)$-independence. We consider probability distributions that are uniform over some 
set $S \subseteq \{0, 1\}^n$; such a set is denoted by $S_n$ and called a sample space. 

Let $S_n$ be a sample space, and let $X = x_1 \ldots x_n$ be a string chosen uniformly 
from $S_n$. Then $S_n$ is $k$-wise independent if, for any $k$ indices $i_1 < i_2 < \cdots < i_k$ 
and any $k$-bit string $\alpha$, we have $\Pr[x_{i_1} x_{i_2} \ldots x_{i_k} = \alpha] = 2^{-k}$. Similarly, for $S_n$ 
and $X$ as above, $S_n$ is $(\epsilon, k)$-independent if $|\Pr[x_{i_1} x_{i_2} \ldots x_{i_k} = \alpha] - 2^{-k}| \le \epsilon$ for 
any $k$ indices $i_1 < i_2 < \cdots < i_k$ and any $k$-bit string $\alpha$. 
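For very small sample spaces, the two definitions can be checked by brute force. The following sketch is illustrative only; the function name `independence_error` and the representation of a sample space as a list of bit tuples are assumptions, not notation from the paper. It returns the smallest $\epsilon$ for which a given space is $(\epsilon, k)$-independent, so a return value of 0 means exact $k$-wise independence.

```python
from itertools import combinations, product


def independence_error(sample_space, k):
    """Smallest eps such that the uniform distribution on sample_space
    (a list of equal-length bit tuples) is (eps, k)-independent."""
    n = len(sample_space[0])
    total = len(sample_space)
    worst = 0.0
    for indices in combinations(range(n), k):
        # Count how often each k-bit pattern appears at the chosen positions.
        counts = {}
        for string in sample_space:
            pattern = tuple(string[i] for i in indices)
            counts[pattern] = counts.get(pattern, 0) + 1
        for pattern in product((0, 1), repeat=k):
            probability = counts.get(pattern, 0) / total
            worst = max(worst, abs(probability - 2.0 ** -k))
    return worst


if __name__ == "__main__":
    # The four even-parity strings of length 3.
    even_parity = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
    print(independence_error(even_parity, 2))
    print(independence_error(even_parity, 3))
```

In this toy example the even-parity strings are exactly 2-wise independent, while for $k = 3$ every pattern deviates from $2^{-3}$ by $1/8$.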

Naor and Naor [9] present an efficient construction of small $(\epsilon, k)$-independent 
sample spaces; three simpler constructions are given in [2]. Here we recall just 
one construction from [2], the powering construction, although any of their three 
constructions could be used for our purposes. 

Consider the Galois field GF($2^m$) and the associated $m$-dimensional vector 
space over GF(2). For every element $u$ of GF($2^m$), let bin($u$) denote the 
corresponding binary vector in the associated vector space. The sample space $\mathrm{Pow}_N^{2m}$ 
is defined as a set of $N$-bit strings such that each string $w$ is determined as 
follows. Two elements $x, y \in$ GF($2^m$) are chosen uniformly at random. For each 
$1 \le i \le N$, the $i$th bit $w_i$ is defined as $\langle \mathrm{bin}(x^i), \mathrm{bin}(y) \rangle$, where $\langle a, b \rangle$ denotes the 
inner product over GF(2) of binary vectors $a$ and $b$. 

Lemma 1 ([2]). The sample space $\mathrm{Pow}_N^{2m}$ is $(N/2^m, k)$-independent for every 
$k \le N$. 
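The powering construction is short enough to sketch in code. The sketch below assumes GF($2^4$) represented modulo $x^4 + x + 1$ and $N = 3$; these parameters and the helper names `gf_mul`, `powering_string`, and `max_bias` are illustrative choices, not part of the paper. The bias check uses the standard root-counting argument (a nonzero polynomial of degree at most $N$ has at most $N$ roots) as a sanity test and is not meant as the exact statement of Lemma 1.

```python
def gf_mul(a, b, m, modulus):
    """Multiply a and b in GF(2^m); modulus is the irreducible polynomial
    including its x^m term, encoded as an integer bit mask."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if (a >> m) & 1:
            a ^= modulus
    return result


def powering_string(x, y, n_bits, m, modulus):
    """Bits w_1..w_N with w_i = <bin(x^i), bin(y)> over GF(2)."""
    bits = []
    power = 1
    for _ in range(n_bits):
        power = gf_mul(power, x, m, modulus)          # power is now x^i
        bits.append(bin(power & y).count("1") & 1)    # inner product mod 2
    return bits


def max_bias(n_bits, m, modulus):
    """Largest bias |Pr[parity = 0] - Pr[parity = 1]| over nonempty index sets."""
    field_size = 1 << m
    samples = [powering_string(x, y, n_bits, m, modulus)
               for x in range(field_size) for y in range(field_size)]
    worst = 0.0
    for mask in range(1, 1 << n_bits):
        ones = 0
        for w in samples:
            parity = 0
            for i in range(n_bits):
                if (mask >> i) & 1:
                    parity ^= w[i]
            ones += parity
        worst = max(worst, abs(len(samples) - 2 * ones) / len(samples))
    return worst


if __name__ == "__main__":
    print(max_bias(3, 4, 0b10011))
```

Each string is determined by the $2m$ seed bits of $(x, y)$, which is why a search over the whole sample space takes only $2^{2m}$ steps.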

As we have mentioned in the introduction, we shall view the strings of the 
sample space $\mathrm{Pow}_N^{2m}$ as the truth tables of Boolean functions of $\log N$ variables. 
It will be convenient to assume that $N$ is a power of 2, i.e., $N = 2^n$. Thus, the 
uniform distribution over the sample space $\mathrm{Pow}_{2^n}^{2m}$ induces a distribution $F_{n,m}$ 
on Boolean functions of $n$ variables that satisfies the following lemma. 

Lemma 2. Let $A$ be any set of $k$ strings from $\{0, 1\}^n$, for any $k \le 2^n$. Let $\phi$ be 
any Boolean function defined on $A$. For a Boolean function $f$ chosen according 
to the distribution $F_{n,m}$ defined above, we have $|\Pr[f|_A = \phi] - 2^{-k}| \le 2^{-(m-n)}$, 
where $f|_A$ denotes the restriction of $f$ to the set $A$. 

Proof: The $k$ strings in $A$ determine $k$ indices $i_1, \ldots, i_k$ in the truth table of $f$. 
The function $\phi$ is determined by its truth table, a binary string $\alpha$ of length $k$. 
Now the claim follows immediately from Lemma 1 and the definition of 
$(\epsilon, k)$-independence. ■ 

Razborov [11] showed that there exist complex combinatorial structures (such 
as the Ramsey graphs, rigid graphs, etc.) of exponential size which can be 
encoded by polynomial-size bounded-depth Boolean formulas over the basis 
$\{\&, \oplus, 1\}$. In effect, Razborov gave a construction of $\epsilon$-biased sample spaces 
(using the terminology of [9]), where the elements of such sample spaces are 
the truth tables of $AC^0[2]$-computable Boolean functions chosen according to a 
certain distribution on $AC^0[2]$-formulas. We describe this distribution next. 

For $n, m, l \in \mathbb{N}$, a random formula $F(n, m, l)$ of depth 3 is defined as 

$$F(n, m, l) = \bigoplus_{\alpha=1}^{l} \bigwedge_{\beta=1}^{m} \left( \left( \bigoplus_{\gamma=1}^{n} \lambda_{\alpha\beta\gamma} x_\gamma \right) \oplus \lambda_{\alpha\beta} \right), \qquad (1)$$

where $\{\lambda_{\alpha\beta}, \lambda_{\alpha\beta\gamma}\}$ is a collection of $(n + 1)ml$ independent random variables 
uniformly distributed on $\{0, 1\}$. The following lemma shows that this distribution 
determines an $\epsilon$-biased sample space; as observed in [13], a slight modification 
of the above construction yields somewhat better parameters, but the simpler 
construction would suffice for us here. 
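A random formula of this kind can be sampled and evaluated in a few lines. The sketch below reads (1) as an XOR of $l$ conjunctions, each conjunction taken over $m$ affine forms in $x_1, \ldots, x_n$ over GF(2); that reading, the nested-list layout of the coefficients, and the names `sample_formula` and `evaluate_formula` are assumptions made for illustration.

```python
import random


def sample_formula(n, m, l, rng):
    """Draw the (n + 1) * m * l coefficient bits of a depth-3 formula.
    Layout: l AND-gates, each a list of m pairs (linear coefficients, constant)."""
    return [[([rng.randrange(2) for _ in range(n)], rng.randrange(2))
             for _ in range(m)]
            for _ in range(l)]


def evaluate_formula(coefficients, x):
    """Evaluate XOR_alpha AND_beta ((XOR_gamma lambda * x_gamma) XOR constant)."""
    value = 0
    for and_gate in coefficients:              # outer XOR over alpha
        conjunction = 1
        for linear, constant in and_gate:      # AND over beta
            inner = constant
            for coefficient, bit in zip(linear, x):   # inner XOR over gamma
                inner ^= coefficient & bit
            conjunction &= inner
        value ^= conjunction
    return value


if __name__ == "__main__":
    formula = sample_formula(n=4, m=3, l=5, rng=random.Random(2000))
    print([evaluate_formula(formula, (a, b, c, d))
           for a in (0, 1) for b in (0, 1) for c in (0, 1) for d in (0, 1)])
```

The printed list is the truth table of one sampled function; the distribution on such truth tables is what the following lemma describes.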

Lemma 3 ([11]). Let $k, l, m \in \mathbb{N}$ be any numbers such that $k \le 2^{m-1}$, let $A$ be 
any set of $k$ strings from $\{0, 1\}^n$, and let $\phi$ be any Boolean function defined on 
$A$. For a Boolean function $f$ computed by the random formula $F(n, m, l)$ defined 
in (1), we have $|\Pr[f|_A = \phi] - 2^{-k}| \le$ [...], where $f|_A$ denotes the restriction 
of $f$ to the set $A$. 

The proof of Lemma 3 is most easily obtained by manipulating certain dis- 
crete Fourier transforms. We refer the interested reader to [11] or [13] for details. 

Below we give the definitions of some classes of Boolean functions hard for 
1-b.p.'s. We say that a Boolean function $f_n(x_1, \ldots, x_n)$ is $r$-mixed for some $r \le n$ if, 
for every subset $X$ of $r$ input variables $\{x_{i_1}, \ldots, x_{i_r}\}$, no two distinct assignments 
to $X$ yield the same subfunction of $f$ in the remaining $n - r$ variables. We shall 
see in the following section that an $r$-mixed function for $r = n - \lceil\log n\rceil - 2$ 
has a nonzero probability in a distribution $F_{n,m}$, where $m \in O(n)$, and in the 
distribution induced by the random formula $F(n, m, l)$, where $m \in O(\log n)$ and 
$l \in$ poly($n$). 
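For small $n$, the definition of an $r$-mixed function can be tested exhaustively. The checker below is a brute-force sketch (exponential in $n$) and the name `is_r_mixed` is hypothetical: for every set of $r$ variables it records the truth table of each induced subfunction and reports a failure as soon as two assignments give the same table.

```python
from itertools import combinations, product


def is_r_mixed(f, n, r):
    """Brute-force test that the n-variable Boolean function f is r-mixed.
    f takes a tuple of n bits and returns 0 or 1."""
    for subset in combinations(range(n), r):
        rest = [i for i in range(n) if i not in subset]
        seen_subfunctions = set()
        for assignment in product((0, 1), repeat=r):
            x = [0] * n
            for i, bit in zip(subset, assignment):
                x[i] = bit
            # Truth table of the subfunction on the remaining n - r variables.
            table = []
            for free_values in product((0, 1), repeat=n - r):
                for i, bit in zip(rest, free_values):
                    x[i] = bit
                table.append(f(tuple(x)))
            table = tuple(table)
            if table in seen_subfunctions:
                return False
            seen_subfunctions.add(table)
    return True


if __name__ == "__main__":
    parity = lambda x: sum(x) % 2
    print(is_r_mixed(parity, 3, 1), is_r_mixed(parity, 3, 2))
```

Parity on three bits is 1-mixed but not 2-mixed, since fixing two variables to 00 or to 11 leaves the same subfunction.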

It was observed by many researchers that $r$-mixed Boolean functions are hard 
for 1-b.p.'s. The following lemma is implicit in [17,5], and is a particular case of 
results in [7,16]. 

Lemma 4 ([17,5,7,16]). Let $f_n(x_1, \ldots, x_n)$ be an $r$-mixed Boolean function, 
for some $r \le n$. Then every 1-b.p. computing $f_n$ has size at least $2^r - 1$. 

Following Savicky and Zak [15], we call a function $\phi : \{0, 1\}^n \to \{1, 2, \ldots, n\}$ 
$(s, n, q)$-complete, for some integers $s$, $n$, and $q$, if for every set $I \subseteq \{1, \ldots, n\}$ of 
size $n - s$ we have 

1. for every 0-1 assignment to the variables $x_i$, $i \in I$, the range of the resulting 
subfunction of $\phi$ is equal to $\{1, 2, \ldots, n\}$, and 

2. there are at most $q$ different subfunctions of $\phi$, as one varies over all 0-1 
assignments to $x_i$, $i \in I$. 

Our interest in $(s, n, q)$-complete functions is justified by the following lemma; 
its proof is based on a generalization of Lemma 4. 

Lemma 5 ([15]). Let $\phi : \{0, 1\}^n \to \{1, 2, \ldots, n\}$ be an $(s, n, q)$-complete 
function. Then the Boolean function $f_n(x_1, \ldots, x_n) = x_{\phi(x_1, \ldots, x_n)}$ requires 1-b.p.'s of 
size at least $2^{n-s}/q$. 

The following lemma can be used to construct an $(s, n, q)$-complete function. 

Lemma 6 ([15]). Let $A$ be a $t \times n$ matrix over GF(2) with every $t \times s$ submatrix 
of rank at least $r$. Let $\psi : \{0, 1\}^t \to \{1, 2, \ldots, n\}$ be a mapping such that its 
restriction to every affine subset of $\{0, 1\}^t$ of dimension at least $r$ has the range 
$\{1, 2, \ldots, n\}$. Then the function $\phi(x) = \psi(Ax)$ is $(s, n, 2^t)$-complete. 
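The two conditions defining an $(s, n, q)$-complete function, and the pointer-style function of Lemma 5, can be illustrated on toy instances. The code below is a brute-force sketch for very small $n$; the names `is_complete` and `pointer_function` are hypothetical, and the example function $\phi(x) = 1 + (x_1 \oplus x_2)$ is chosen only because it is easy to verify by hand.

```python
from itertools import combinations, product


def is_complete(phi, n, s, q):
    """Check both conditions of (s, n, q)-completeness for phi, which maps
    n-bit tuples to {1, ..., n}, over every index set I of size n - s."""
    for fixed in combinations(range(n), n - s):
        free = [i for i in range(n) if i not in fixed]
        subfunctions = set()
        for assignment in product((0, 1), repeat=n - s):
            x = [0] * n
            for i, bit in zip(fixed, assignment):
                x[i] = bit
            table = []
            for values in product((0, 1), repeat=s):
                for i, bit in zip(free, values):
                    x[i] = bit
                table.append(phi(tuple(x)))
            if set(table) != set(range(1, n + 1)):
                return False          # condition 1: the range must be {1..n}
            subfunctions.add(tuple(table))
        if len(subfunctions) > q:
            return False              # condition 2: at most q subfunctions
    return True


def pointer_function(phi):
    """The function f(x) = x_{phi(x)} of Lemma 5 (phi is 1-indexed)."""
    return lambda x: x[phi(x) - 1]


if __name__ == "__main__":
    phi = lambda x: 1 + (x[0] ^ x[1])
    print(is_complete(phi, 2, 1, 2), is_complete(phi, 2, 1, 1))
    f = pointer_function(phi)
    print([f(x) for x in product((0, 1), repeat=2)])
```

For this toy $\phi$, fixing either variable leaves two different subfunctions, so it is (1, 2, 2)-complete but not (1, 2, 1)-complete.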

A probabilistic argument shows that a $t \times n$ matrix $A$ and a function 
$\psi : \{0, 1\}^t \to \{1, 2, \ldots, n\}$ exist that satisfy the assumptions of Lemma 6 for the 
choice of parameters $s, t, r \in O(\log n)$, thereby yielding a Boolean function that 
requires 1-b.p.'s of size at least $2^{n-O(\log n)}$. Below we will show that the argument 
uses only limited independence of random bits, and hence it can be derandomized 
using the known constructions of $(\epsilon, k)$-independent spaces. Our proof will utilize 
the following lemma of Razborov. 

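Lemma 7 ([11]). Let $l > 2k$ be any natural numbers, let $0 < \theta, \epsilon < 1$, and let 
$E_1, \ldots, E_l$ be events such that, for every subset $I \subseteq \{1, \ldots, l\}$ of size at most $k$, 
we have $|\Pr[\bigwedge_{i \in I} E_i] - \theta^{|I|}| \le \epsilon$. Then $\Pr[\bigwedge_{i=1}^{l} \bar{E}_i] \le e^{-\theta l} + \binom{l}{k+1}(\epsilon k + \theta^k)$. 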

3 Constructing r-Mixed Boolean Functions 

First, we give a simple probabilistic argument showing that $r$-mixed functions 
exist for $r = n - \lceil\log n\rceil - 2$. Let $f$ be a Boolean function on $n$ variables that 
is chosen uniformly at random from the set of all Boolean $n$-variable functions. 
For any fixed set of indices $\{i_1, \ldots, i_r\} \subseteq \{1, \ldots, n\}$ and any two fixed binary 
strings $\alpha = \alpha_1, \ldots, \alpha_r$ and $\beta = \beta_1, \ldots, \beta_r$, the probability that fixing $x_{i_1}, \ldots, x_{i_r}$ 
to $\alpha$ and then to $\beta$ will give the same subfunction of $f$ in the remaining $n - r$ 
variables is $2^{-k}$, where $k = 2^{n-r}$. Thus, the probability that $f$ is not $r$-mixed is 
at most $\binom{n}{r} 2^{2r} 2^{-k}$, which tends to 0 as $n$ grows. 
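The rate at which this union bound vanishes is easy to tabulate. The short sketch below evaluates the base-2 logarithm of the bound $\binom{n}{r} 2^{2r} 2^{-k}$ with $r = n - \lceil\log n\rceil - 2$ and $k = 2^{n-r}$; the function name `log2_union_bound` is an illustrative choice, and the code is only a numerical check of the argument, not part of it.

```python
import math


def log2_union_bound(n):
    """log2 of C(n, r) * 2^(2r) * 2^(-k) for r = n - ceil(log2 n) - 2, k = 2^(n - r)."""
    ceil_log = (n - 1).bit_length()      # equals ceil(log2 n) for n >= 1
    r = n - ceil_log - 2
    k = 2 ** (n - r)
    return math.log2(math.comb(n, r)) + 2 * r - k


if __name__ == "__main__":
    for n in (16, 64, 256, 1024):
        print(n, round(log2_union_bound(n), 1))
```

Since $k \ge 4n$ while $\binom{n}{r} 2^{2r} \le 2^{3n}$, the logarithm is at most $-n$, which the printed values reflect.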

We observe that the above argument only used the fact that $f$ is random on 
any set of $2k$ inputs: those obtained after the $r$ variables $x_{i_1}, \ldots, x_{i_r}$ are fixed 
to $\alpha$, the set of which will be denoted as $A_\alpha$, plus those obtained after the same 
variables are fixed to $\beta$, the set of which will be denoted as $A_\beta$. This leads us to 
the following theorem. 

Theorem 1. There is an $m \in O(n)$ for which the probability that a Boolean 
$n$-variable function $f$ chosen according to the distribution $F_{n,m}$ is $r$-mixed, for 
$r = n - \lceil\log n\rceil - 2$, tends to 1 as $n$ grows. 

Proof: By Lemma 2, the distribution $F_{n,m}$ yields a function $f$ which is equal 
to any fixed Boolean function $\phi$ defined on a set $A_\alpha \cup A_\beta$ of $2k$ inputs with 
probability at most $2^{-2k} + 2^{-(m-n)}$. The number of functions $\phi$ that assume the 
same values on the corresponding pairs of elements $a \in A_\alpha$ and $b \in A_\beta$ is $2^k$. 
Thus, the probability that $f$ is not $r$-mixed is at most $\binom{n}{r} 2^{2r} (2^{-k} + 2^{-(m-n-k)})$. 
If $m = (7 + \delta) n$ for any $\delta > 0$, then this probability tends to 0 as $n$ grows. ■ 

By definition, each function from $F_{n,m}$ can be computed by a Boolean circuit 
of size poly($n, m$). It should also be clear that checking whether a function from 
$F_{n,m}$, given by a $2m$-bit string, is $r$-mixed can be done in LINSPACE. It follows 
from Theorem 1 that we can find an $r$-mixed function, for $r = n - \lceil\log n\rceil - 2$, 
in LINSPACE by picking the lexicographically first string of $2m$ bits that 
determines such a function. By Lemma 4, this function will have the optimal 
lower bound for 1-b.p.'s, $\Omega(2^n/n)$. 

We should point out that any of the three constructions of small 
$(\epsilon, k)$-independent spaces in [2] could be used in the same manner as described above 
to obtain an $r$-mixed Boolean function computable in LINSPACE $\cap$ P/poly, for 
$r = n - \lceil\log n\rceil - 2$. Applying Lemma 3, we can obtain an $r$-mixed function with 
the same value of $r$. 

Theorem 2. There are $m \in O(\log n)$ and $l \in$ poly($n$) for which the probability 
that a Boolean $n$-variable function $f$ computed by the random formula $F(n, m, l)$ 
defined in (1) is $r$-mixed, for $r = n - \lceil\log n\rceil - 2$, tends to 1 as $n$ grows. 

Proof: Proceeding as in the proof of Theorem 1, with Lemma 3 applied instead 
of Lemma 2, we obtain that the probability that $f$ is not $r$-mixed is at most 
$\binom{n}{r} 2^{2r} (2^{-k} +$ [...]$)$. If $m = \lceil\log n\rceil + 3$ and $l = (6 + \delta)n^2$ for any $\delta > 0$, 
then this probability tends to 0 as $n$ grows. ■ 

Corollary 1. There exists a Boolean function computable by a polynomial-size 
depth 3 formula over the basis $\{\&, \oplus, 1\}$ that requires a 1-b.p. of size at least 
$\Omega(2^n/n)$ for all sufficiently large $n$. 

4 Constructing (s, n, q)-Complete Functions 

Let us take a look at the probabilistic proof (as presented in [15]) of the existence 
of a matrix $A$ and a function $\psi$ with the properties assumed in Lemma 6. Suppose 
that a $t \times n$ matrix $A$ over GF(2) and a function $\psi : \{0, 1\}^t \to \{1, 2, \ldots, n\}$ are 
chosen uniformly at random. For a fixed $t \times s$ submatrix $B$ of $A$, if rank($B$) $< r$, 
then there is a set of at most $r - 1$ columns in $B$ whose linear span contains 
each of the remaining $s - r + 1$ columns of $B$. For a fixed set $R$ of such $r - 1$ 
columns in $B$, the probability that each of the $s - r + 1$ vectors chosen uniformly 
at random will be in the linear span of $R$ is at most $(2^{r-1}/2^t)^{s-r+1}$. Thus, the 
probability that the matrix $A$ is “bad” is at most $\binom{n}{s}\binom{s}{r-1} 2^{-(t-r+1)(s-r+1)}$. 

For a fixed affine subspace $H$ of $\{0, 1\}^t$ of dimension $r$ and a fixed $1 \le i \le n$, 
the probability that the range of $\psi$ restricted to $H$ does not contain $i$ is at most 
$(1 - 1/n)^{2^r}$. The number of different affine subspaces of $\{0, 1\}^t$ of dimension $r$ is 
at most $2^{(r+1)t}$, and the number of different $i$'s is $n$. Hence the probability that $\psi$ is 
“bad” is at most $2^{(r+1)t} n (1 - 1/n)^{2^r}$. 

An easy calculation shows that setting $s = \lceil(2 + \delta) \log n\rceil$, $t = \lceil(3 + \delta) \log n\rceil$, 
and $r = \lceil\log n + 2 \log\log n + b\rceil$, for any $\delta > 0$ and sufficiently large $b$ (say, $b = 3$ 
and $\delta = 0.01$), makes both the probability that $A$ is “bad” and the probability 
that $\psi$ is “bad” tend to 0 as $n$ grows. 

Theorem 3. There are $d_1, d_2, d_3 \in \mathbb{N}$ such that every $(2^{-d_1 \log^3 n}, d_2 \log^2 n)$-independent 
sample space over $n^{d_3}$-bit strings contains both matrix $A$ and function 
$\psi$ with the properties as in Lemma 6, for $s, r, t \in O(\log n)$. 

Proof: We observe that both probabilistic arguments used only partial 
independence of random bits. For $A$, we need a $tn$-bit string coming from an 
$(\epsilon, k)$-independent sample space with $k = ts$ and $\epsilon = 2^{-c_1 \log^2 n}$, for a sufficiently large 
constant $c_1$. Indeed, for a fixed $t \times s$ submatrix $B$ of $A$ and a fixed set $R$ of 
$r - 1$ columns in $B$, the number of “bad” $ts$-bit strings $\alpha$ filling $B$ so that 
the column vectors in $R$ contain in their linear span all the remaining $s - r + 1$ 
column vectors of $B$ is at most $2^{(r-1)t} 2^{(r-1)(s-r+1)} = 2^{(r-1)(s+t-r+1)}$. If $A$ is 
chosen from the $(\epsilon, k)$-independent sample space with $\epsilon$ and $k$ as above, then the 
probability that some fixed “bad” string $\alpha$ is chosen is at most $2^{-ts} + \epsilon$. Thus, 
in this case, the probability that $A$ is “bad” is at most 

$$\binom{n}{s}\binom{s}{r-1} \left( 2^{-(t-r+1)(s-r+1)} + \epsilon\, 2^{(r-1)(s+t-r+1)} \right).$$

Choosing the same $s$, $t$, and $r$ as in the case of fully independent probability 
distribution, one can make this probability tend to 0 as $n$ grows, by choosing 
sufficiently large $c_1$. 

Similarly, for the function $\psi$, we need a $2^t \lceil\log n\rceil$-bit string from an 
$(\epsilon, k)$-independent sample space with $k = c_2 \log^2 n$ and $\epsilon = 2^{-c_3 \log^3 n}$, for sufficiently 
large constants $c_2$ and $c_3$. Here we view the truth table of $\psi$ as a concatenation 
of $2^t$ $\lceil\log n\rceil$-bit strings, where each $\lceil\log n\rceil$-bit string encodes a number from 
$\{1, \ldots, n\}$. The proof, however, is slightly more involved in this case, and depends 
on Lemma 7. 

Let $s$, $r$, and $t$ be the same as before. For a fixed affine subspace $H \subseteq \{0, 1\}^t$ 
of dimension $r$, such that $H = \{a_1, \ldots, a_l\}$ for $l = 2^r$, and for a fixed $1 \le i \le n$, 
let $E_j$, $1 \le j \le l$, be the event that $\psi(a_j) = i$ when $\psi$ is chosen from the 
$(\epsilon, k)$-independent sample space defined above. Then Lemma 7 applies with 
$\theta = 2^{-\lceil\log n\rceil}$, yielding that the probability that $\psi$ misses the value $i$ on the subspace 
$H$ is 

$$\Pr\Bigl[\bigwedge_{j=1}^{l} \bar{E}_j\Bigr] \le e^{-\theta l} + \binom{l}{k+1} \left( \epsilon k + 2^{-k\lceil\log n\rceil} \right). \qquad (2)$$

It is easy to see that the first term on the right-hand side of (2) is at most 
$e^{-4\log^2 n}$ (when $b = 3$ in $r$). We need to bound from above the remaining two 
terms: $\binom{l}{k+1} 2^{-k\lceil\log n\rceil}$ and $\binom{l}{k+1} \epsilon k$. Using Stirling's formula, one can show that 
the first of these two terms can be made at most [...], by choosing $c_2$ 
sufficiently large. Having fixed $c_2$, we can also make the second of the terms at 
most [...], by choosing $c_3 > c_2$ sufficiently large. It is then straightforward 
to verify that the probability that $\psi$ misses at least one value $i$, $1 \le i \le n$, on 
at least one affine subspace of dimension $r$ tends to 0 as $n$ grows. ■ 

Using any efficient construction of almost independent sample spaces, for 
example, $\mathrm{Pow}_N^{2m}$ with $N = tn \in O(n \log n)$ and $m \in O(\log^2 n)$, we can find a matrix 
$A$ with the required properties in DTIME($2^{O(\log^2 n)}$) by searching through all 
elements of the sample space and checking whether any of them yields a desired 
matrix. Analogously, we can find the required function $\psi$ in DTIME($2^{O(\log^3 n)}$), 
by considering, e.g., $\mathrm{Pow}_{N'}^{2m'}$ with $N' = 2^t \lceil\log n\rceil$ and $m' \in O(\log^3 n)$. Thus, 
constructing both $A$ and $\psi$ can be carried out in quasipolynomial time. 

Given the corresponding advice strings of $O(\log^3 n)$ bits, $\psi$ is computable 
in time polylog($n$) and all elements of $A$ can be computed in time $n$ polylog($n$). 
So, in this case, the function $\phi(x) = \psi(Ax)$ is computable in quasilinear time. 
Hence, by “hard-wiring” good advice strings, we get the function $f_n(x) = x_{\phi(x)}$ 
computable by quasilinear-size circuits, while, by Lemmas 5 and 6, $f_n$ requires 
1-b.p.'s of size at least $2^{n-(5+\epsilon)\log n}$, for any $\epsilon > 0$ and sufficiently large $n$; these 
parameters appear to be better than those in [3]. By making the advice strings 
a part of the input, we obtain a function in QL that requires 1-b.p.'s of size at 
least $2^{n-O(\log^3 n)}$. 
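The exponent of the lower bound for $f_n$ can be traced through Lemmas 5 and 6 using the parameters $s = \lceil(2+\delta)\log n\rceil$ and $t = \lceil(3+\delta)\log n\rceil$ chosen earlier in this section. The following is a sketch of that arithmetic; the way the slack constant $\epsilon$ absorbs $2\delta$ and the rounding is our reading rather than a step spelled out in the text.

```latex
\[
\frac{2^{\,n-s}}{q}
  \;=\; \frac{2^{\,n-s}}{2^{\,t}}
  \;=\; 2^{\,n-(s+t)}
  \;\ge\; 2^{\,n-(5+2\delta)\log n-2}
  \;\ge\; 2^{\,n-(5+\epsilon)\log n}
\qquad \text{for any } \epsilon > 2\delta \text{ and sufficiently large } n .
\]
```

Here $q = 2^t$ comes from Lemma 6, and the first inequality uses $s + t \le (5 + 2\delta)\log n + 2$.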

We end this section by observing that the method used above to construct an 
$(s, n, q)$-complete Boolean function could be also used to construct an $r$-mixed 
Boolean function for $r = n - O(\log n)$ by derandomizing Savicky's [14] 
modification of the procedure in [15]. This $r$-mixed function is also determined by an 
advice string of length polylog($n$), and hence can be constructed in 
quasipolynomial time. 

5 Concluding Remarks 

We have shown how the well-known constructions of small $\epsilon$-biased sample 
spaces [11,9,2] can be directly used to obtain Boolean functions that are 
exponentially hard for 1-b.p.'s. One might argue, however, that the hard Boolean 
functions constructed in Sections 3 and 4 are not “explicit” enough, since they 
are defined as the lexicographically first functions in certain search spaces. It 
would be interesting to find a Boolean function in P or NP with the optimal lower 
bound $\Omega(2^n/n)$ for 1-b.p.'s. The problem of constructing a polynomial-time 
computable $r$-mixed Boolean function with $r$ as large as possible is of independent 
interest; at present, the best such function is given in [15] for $r = n - \Omega(\sqrt{n})$. 
A related open question is to determine whether the minimum number of bits 
needed to specify a Boolean function with the optimal lower bound for 1-b.p.'s, 
or an $r$-mixed Boolean function for $r = n - \lceil\log n\rceil - 2$, can be sublinear. 

Abstract. We study communication complexity in the model of 
Simultaneous Messages (SM). The SM model is a restricted version of the 
well-known multiparty communication complexity model [CFL,KN]. 
Motivated by connections to circuit complexity, lower and upper bounds on 
the SM complexity of several explicit functions have been intensively 
investigated in [PR,PRS,BKL,Am1,BGKL]. 

A class of functions called the Generalized Addressing Functions (GAF), 
denoted $GAF_{G,k}$, where $G$ is a finite group and $k$ denotes the number of 
players, plays an important role in SM complexity. In particular, lower 
bounds on SM complexity of $GAF_{G,k}$ were used in [PRS] and [BKL] 
to show that the SM model is exponentially weaker than the general 
communication model [CFL] for sufficiently small number of players. 
Moreover, certain unexpected upper bounds from [PRS] and [BKL] on 
SM complexity of $GAF_{G,k}$ have led to refined formulations of certain 
approaches to circuit lower bounds. 

In this paper, we show improved upper bounds on the SM complexity 
of $GAF_{\mathbb{Z}_2^t,k}$. In particular, when there are three players ($k = 3$), 
we give an upper bound of $O(n^{0.73})$, where $n = 2^t$. This improves a 
bound of $O(n^{0.92})$ from [BKL]. The lower bound in this case is $\Omega(\sqrt{n})$ 
[BKL, PRS]. More generally, for the $k$ player case, we prove an upper 
bound of $O(n^{H(1/(2k-2))})$, improving a bound of $O(n^{H(1/k)})$ from [BKL], 
where $H(\cdot)$ denotes the binary entropy function. For large enough $k$, 
this is nearly a quadratic improvement. The corresponding lower bound 
is $\Omega(n^{1/(k-1)}/(k-1))$ [BKL, PRS]. Our proof extends some algebraic 
techniques from [BKL] and employs a greedy construction of covering 
codes. 

1 Introduction 

The Multiparty Communication Model: The model of multiparty 
communication complexity plays a fundamental role in the study of Boolean function 
complexity. It was introduced by Chandra, Furst, and Lipton [CFL] and has been 
intensively studied (see the book by Kushilevitz and Nisan [KN] and references 
therein). In a multiparty communication game, $k$ players wish to collaboratively 
evaluate a Boolean function $f(x_0, \ldots, x_{k-1})$. The $i$-th player knows each input 
argument except $x_i$; we will refer to $x_i$ as the input missed by player $i$. We can 
imagine input $x_i$ written on the forehead of player $i$. The players communicate 
using a blackboard, visible to all the players. Each player has unlimited 
computational power. The “algorithm” followed by the players in their exchange of 
messages is called a protocol. The cost of a protocol is the total number of bits 
communicated by the players in evaluating $f$ at a worst-case input. The 
multiparty communication complexity of $f$ is then defined as the minimum cost of a 
protocol for $f$. 

The Simultaneous Messages (SM) Model: A restricted model of multiparty 
communication complexity, called the Simultaneous Messages (SM) model, 
recently attracted much attention. It was implicit in a paper by Nisan and 
Wigderson [NW, Theorem 7] for the case of three players. The first papers 
investigating the SM model in detail are by Pudlak, Rodl, and Sgall [PRS] (under the 
name “Oblivious Communication Complexity”), and independently, by Babai, 
Kimmel, and Lokam [BKL]. 

In the $k$-party SM model, we have $k$ players as before with input $x_i$ of 
$f(x_0, \ldots, x_{k-1})$ written on the forehead of the $i$-th player. However, in this 
model, the players are not allowed to communicate with each other. Instead, 
each player simultaneously sends a single message to a referee who sees none of 
the input. The referee announces the value of the function upon receiving the 
messages from the players. An SM protocol specifies how each of the players can 
determine the message to be sent based on the part of the input that player 
sees, as well as how the referee would determine the value of the function based 
on the messages received from the players. All the players and the referee are 
assumed to have infinite computational power. The cost of an SM protocol is 
defined to be the maximum number of bits sent by a player to the referee, and 
the SM complexity of $f$ is defined to be the minimum cost of an SM protocol 
for $f$ on a worst-case input. Note that in the SM model, we use the $\ell_\infty$-norm 
of the message lengths as the complexity measure as opposed to the $\ell_1$-norm in 
the general model from [CFL] described above. 

The main motivation for studying the SM model comes from the observation 
that sufficiently strong lower bounds in this restricted model already have some of 
the same interesting consequences to Boolean circuit complexity as the general 
multiparty communication model. Moreover, it is proved in [PRS] and [BKL] 
that the SM model is exponentially weaker than the general communication 
model when the number of players is at most $(\log n)^{1-\epsilon}$ for any constant $\epsilon > 0$. 
This exponential gap is proved by comparing the complexities of the Generalized 
Addressing Function (GAF) in the respective models. 

Generalized Addressing Function (GAF): The input to $GAF_{G,k}$, where 
$G$ is a group of order $n$, consists of $n + (k - 1) \log n$ bits partitioned among the 
players as follows: player 0 gets a function $x_0 : G \to \{0, 1\}$ (represented as an 
$n$-bit string) on her forehead whereas players 1 through $k - 1$ get group elements 
$x_1, \ldots, x_{k-1}$, respectively, on their foreheads. The output of $GAF_{G,k}$ on this 
input is the value of the function $x_0$ on $x_1 \circ \cdots \circ x_{k-1}$, where $\circ$ represents the 
group operation in $G$. Formally, 

$$GAF_{G,k}(x_0, x_1, \ldots, x_{k-1}) := x_0(x_1 \circ \cdots \circ x_{k-1}).$$
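The definition of GAF and the SM setting can be illustrated with a short sketch. Group elements are encoded here as integers, with XOR as the operation of $\mathbb{Z}_2^t$ and addition modulo $n$ as the operation of $\mathbb{Z}_n$; this encoding and the names `gaf` and `trivial_sm_protocol` are assumptions made for illustration. The protocol shown is only the naive one of cost $n$ for three players, included to make the forehead convention concrete; it is not one of the protocols discussed in this paper.

```python
from functools import reduce
from operator import xor


def gaf(x0, elements, op):
    """Value of x0 at the group product of elements.
    x0 is a sequence of bits indexed by (integer-encoded) group elements."""
    return x0[reduce(op, elements)]


def trivial_sm_protocol(x0, x1, x2):
    """A naive 3-player SM protocol for G = Z_2^t with cost n = len(x0) bits."""
    n = len(x0)
    # Player 1 sees x0 and x2 but not x1: the message is x0 shifted by x2 (n bits).
    message_1 = [x0[z ^ x2] for z in range(n)]
    # Player 2 sees x0 and x1 but not x2: the message is x1 (log n bits).
    message_2 = x1
    # Player 0 sends nothing; the referee looks up a single bit.
    return message_1[message_2]


if __name__ == "__main__":
    x0 = [0, 0, 0, 1, 0, 0, 1, 0]          # a function on a group of order 8
    print(gaf(x0, [5, 3], xor))                          # G = Z_2^3
    print(gaf(x0, [5, 3], lambda a, b: (a + b) % 8))     # G = Z_8
    print(trivial_sm_protocol(x0, 5, 3))
```

The cost of this naive protocol is the length of the longest message, $n$ bits, which is the baseline that sublinear upper bounds improve on.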

In [BKL], a general lower bound is proved on the SM complexity 
of $GAF_{G,k}$ for any finite group $G$. In particular, they prove a lower bound 
$\Omega(n^{1/(k-1)}/(k-1))$ for $GAF_{\mathbb{Z}_n,k}$ (i.e., $G$ is a cyclic group) and $GAF_{\mathbb{Z}_2^t,k}$ (i.e., 
$G$ is a vector space over GF(2)). Pudlak, Rodl, and Sgall [PRS] consider the 
special case of $GAF_{\mathbb{Z}_2^t,k}$ and prove the same lower bound using essentially the 
same technique. 

Upper Bounds on SM complexity: While the Simultaneous Messages 
model itself was motivated by lower bound questions, there have been some 
unexpected developments in the direction of upper bounds in this model. We 
describe some of these upper bounds and their significance below. This paper is 
concerned with upper bounds on the SM complexity of $\mathrm{GAF}_{Z_2^t,k}$.

The results of [BKL] and [PRS] include some upper bounds on the SM complexity
of $\mathrm{GAF}_{Z_2^t,k}$ and $\mathrm{GAF}_{Z_n,k}$, respectively. In [BGKL], upper bounds are
also proved on a class of functions defined by certain depth-2 circuits. This class
included the “Generalized Inner Product” (GIP) function, which was a prime 
example in the study and applications of multiparty communication complexity 
[BNS,G,HG,RW], and the “Majority of Majorities” function. 

Babai, Kimmel, and Lokam [BKL] show an $O(n^{0.92})$ upper bound for
$\mathrm{GAF}_{Z_2^t,3}$, i.e., on the 3-party SM complexity of $\mathrm{GAF}_{G,k}$ when $G = Z_2^t$. More
generally, they show an $O(n^{H(1/k)})$ upper bound for $\mathrm{GAF}_{Z_2^t,k}$. Pudlak, Rodl, and
Sgall [PRS] prove upper bounds for $\mathrm{GAF}_{Z_n,k}$. They show an $O(n \log\log n/\log n)$
upper bound for k = 3, and an $O(n^{6/7})$ upper bound for $k \ge c \log n$. (Actually, upper bounds
in [PRS] are proved for the so-called “restricted semilinear protocols,” but it is
easy to see that they imply essentially the same upper bounds on SM complexity.)
These upper bounds are significantly improved by Ambainis [Am1]
to $O(n \log^{1/4} n / 2^{\sqrt{\log n}})$ for k = 3 and to $O(n^{\epsilon})$ for an arbitrary $\epsilon > 0$ for
$k = O((\log n)^{c(\epsilon)})$. Note that the upper bounds for $\mathrm{GAF}_{Z_2^t,k}$ are much better
than those for $\mathrm{GAF}_{Z_n,k}$. It is interesting that the upper bounds, in contrast to
the lower bound, appear to depend heavily on the structure of the group G.
Specifically, the techniques used in [BKL] and in this paper for $\mathrm{GAF}_{Z_2^t,k}$ and
those used in [PRS] and [Am1] for $\mathrm{GAF}_{Z_n,k}$ seem to be quite different.

Our upper bounds: In this paper, we give an $O(n^{0.73})$ upper bound on the
(3-player) SM complexity of $\mathrm{GAF}_{Z_2^t,3}$, improving the upper bound of $O(n^{0.92})$
from [BKL] for the same problem. For general k we show an upper bound of
$O(n^{H(1/(2k-2))})$ for $\mathrm{GAF}_{Z_2^t,k}$, improving the upper bound of $O(n^{H(1/k)})$ from
[BKL]. For large k, this is nearly a quadratic improvement. The lower bound
on the SM complexity of $\mathrm{GAF}_{Z_2^t,k}$ is $\Omega(n^{1/(k-1)}/(k-1))$. Our results extend
some of the algebraic ideas from [BKL] and employ a greedy construction of
covering codes of $\{0,1\}^t$.

Significance of Upper Bounds: Upper bounds are obviously useful in as- 
sessing the strength of lower bounds. However, upper bounds on SM complexity 
are interesting for several additional reasons. 

First of all, upper bounds on SM complexity of GAF have led to a refined 
formulation of a communication complexity approach to a circuit lower bound 
problem. Before the counterintuitive upper bounds proved in [PR,PRS,BKL], it 
appeared natural to conjecture that the k-party SM complexity of GAF should
be $\Omega(n)$ when k is a constant. In fact, proving an $\omega(n/\log\log n)$ lower bound on
the total amount of communication in a 3-party SM protocol for GAF would have 
proved superlinear size lower bounds on log-depth Boolean circuits computing an 
(n-output) explicit function. This communication complexity approach toward 
superlinear lower bounds for log-depth circuits is due to Nisan and Wigderson 
[NW] and is based on a graph-theoretic reduction due to Valiant [Va]. However, 
results from [PR,PRS,BKL] provide $o(n/\log\log n)$ upper bounds on the total
communication of 3-party SM protocols for $\mathrm{GAF}_{Z_2^t,3}$ and $\mathrm{GAF}_{Z_n,3}$ and hence
rule out the possibility of using lower bounds on total SM complexity to prove
the circuit lower bound mentioned above. On the other hand, these and sim- 
ilar functions are expected to require superlinear size log-depth circuits. This 
situation motivated a more careful analysis of Valiant’s reduction and a refined 
formulation of the original communication complexity approach. In the refined 
approach, proofs of nonexistence of 3-party SM protocols are sought when there 
are certain constraints on the number of bits sent by individual players as op- 
posed to lower bounds on the total amount of communication. This new approach 
is described by Kushilevitz and Nisan in their book [KN, Section 11.3]. 

Secondly, Pudlak, Rodl, and Sgall [PRS] use their upper bounds on restricted 
semilinear protocols to disprove a conjecture of Razborov’s [Ra] concerning the 
contact rank of tensors. 

Finally, the combinatorial and algebraic ideas used in designing stronger 
upper bounds on SM complexity may find applications in other contexts. For 
example, Ambainis [Am1] devised a technique to recursively compose SM protocols
and used this to improve upper bounds from [PRS]. Essentially similar
techniques enabled him in [Am2] to improve upper bounds from [GGKS] on
the communication complexity of k-server Private Information Retrieval (PIR)
schemes. 



1.1 Definitions and Preliminaries 

Since we consider the SM complexity of $\mathrm{GAF}_{Z_2^t,k}$ only, we omit the subscripts
for simplicity of notation. We describe the 3-player protocol in detail. The
definitions and results extend naturally to the k-party case for general k.

We recall below the definition of GAF from the Introduction (in the special
case when $G = Z_2^\ell$, and we write A for $x_0$, the input on the forehead of Player 0).

Definition 1. Assume $n = 2^\ell$ for a positive integer $\ell$. Then, the function
$\mathrm{GAF}(A, x_1, \ldots, x_{k-1})$, where $A \in \{0,1\}^n$ and $x_i \in \{0,1\}^\ell$, is defined by

$$\mathrm{GAF}(A, x_1, \ldots, x_{k-1}) := A(x_1 \oplus \cdots \oplus x_{k-1}),$$

where A is viewed as an $\ell$-input Boolean function $A : \{0,1\}^\ell \to \{0,1\}$.

Note that Player 0 knows only $(k-1)\log n$ bits of information. This information
can be sent to the referee if each of players i, for $1 \le i \le k-2$, sends $x_{i+1}$
and player $k-1$ sends $x_1$ to the referee. (This adds a $\log n$ term to the lengths of
messages sent by these players and will be insignificant.) Thus, we can assume
that player 0 remains silent for the entire protocol and that the referee knows all
the inputs $x_1, \ldots, x_{k-1}$. The goal of players 1 through $k-1$ is to send enough
information about A to the referee to enable him to compute $A(x_1 \oplus \cdots \oplus x_{k-1})$.
We will use the following notation:

$$\Lambda(\ell, b) := \sum_{j=0}^{b} \binom{\ell}{j}.$$

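The following estimates on $\Lambda$ are well-known (see for instance, [vL, Theorem
1.4.5]):

Fact 1 For $0 < \alpha \le 1/2$, $\epsilon > 0$, and sufficiently large $\ell$,

$$2^{\ell(H(\alpha) - \epsilon)} \le \Lambda(\ell, \alpha\ell) \le 2^{\ell H(\alpha)}.$$
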
The following easily proved estimates on the binary entropy function $H(x) =
-x \log x - (1-x)\log(1-x)$ will also be useful:



Fact 2 (i) For $|\delta| \le 1/2$,

$$1 - \frac{\pi^2}{3 \ln 2}\,\delta^2 \le H(1/2 - \delta) \le 1 - \frac{2}{\ln 2}\,\delta^2.$$

(ii) For $k \ge 3$, $H(1/k) \le \log(ek)/k$. ■

Our results extend algebraic techniques from [BKL]. The main observation
in the upper bounds in that paper is that if the function A can be represented
as a low-degree polynomial over GF(2), significant, almost quadratic, savings
in communication are possible compared to the trivial protocol. We use the
following lemma proved in [BKL]:

Lemma 1 (BKL). Let f be an $\ell$-variate multilinear polynomial of degree at
most d over $Z_2$. Then GAF(f, x, y) has an SM-protocol in which each player
sends at most $\Lambda(\ell, \lfloor d/2 \rfloor)$ bits.
2 Simple Protocol

For $z \in \{0,1\}^\ell$, let |z| denote the number of 1's in z.

Lemma 2. Let $A : \{0,1\}^\ell \to \{0,1\}$. Then for each i, $0 \le i \le \ell$, there is a
multilinear polynomial $f_i$ of degree at most $\ell/2$ such that $f_i(z) = A(z)$ for every
z with |z| = i.

Proof: For every $z \in \{0,1\}^\ell$, define the polynomial

$$\delta_z(x) := \begin{cases} \prod_{z_i = 1} x_i, & \text{if } |z| \le \ell/2, \\ \prod_{z_i = 0} (1 - x_i), & \text{if } |z| > \ell/2. \end{cases}$$

Observe that if |x| = |z|, then $\delta_z(x) = 1$ if x = z and $\delta_z(x) = 0$ if $x \ne z$.

Now, define $f_i$ by

$$f_i(x) := \sum_{|z| = i} A(z)\,\delta_z(x).$$

Clearly, $f_i$ is of degree at most $\ell/2$, since $\delta_z(x)$ is of degree at most $\ell/2$. Furthermore,
when |x| = i, all terms, except the term $A(x)\delta_x(x)$, vanish in the sum
defining $f_i$, implying $f_i(x) = A(x)$. (Note that when $|x| \ne i$, $f_i(x)$ need not be
equal to A(x).) ■
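The construction in the proof of Lemma 2 can be checked mechanically on small instances. The sketch below is only an illustration under assumptions of our own: polynomials over GF(2) are stored as sets of monomials (each monomial a frozenset of variable indices), so addition is symmetric difference and $1 - x_i$ is treated as $1 + x_i$; the helper names `delta`, `f_i`, `evaluate`, `degree`, and `check_lemma2` are hypothetical.

```python
from itertools import combinations, product
import random

def delta(z):
    """Monomial set of delta_z over GF(2), following the two cases in the proof."""
    ell = len(z)
    ones = [i for i in range(ell) if z[i] == 1]
    zeros = [i for i in range(ell) if z[i] == 0]
    if len(ones) <= ell / 2:
        return {frozenset(ones)}                 # prod over z_i = 1 of x_i
    # prod over z_i = 0 of (1 + x_i): one monomial per subset of the zero positions
    return {frozenset(T) for r in range(len(zeros) + 1)
            for T in combinations(zeros, r)}

def f_i(A, ell, i):
    """f_i = sum over |z| = i of A(z) * delta_z, reduced mod 2."""
    poly = set()
    for z in product((0, 1), repeat=ell):
        if sum(z) == i and A[z]:
            poly ^= delta(z)                     # addition over GF(2)
    return poly

def evaluate(poly, x):
    """Value of the polynomial at the 0/1 point x."""
    return sum(all(x[j] for j in mono) for mono in poly) % 2

def degree(poly):
    return max((len(mono) for mono in poly), default=0)

def check_lemma2(ell, seed=0):
    """Verify both claims of the lemma for a pseudo-random A on {0,1}^ell."""
    rng = random.Random(seed)
    A = {z: rng.randint(0, 1) for z in product((0, 1), repeat=ell)}
    for i in range(ell + 1):
        p = f_i(A, ell, i)
        if degree(p) > ell / 2:
            return False
        if any(evaluate(p, z) != A[z] for z in A if sum(z) == i):
            return False
    return True

print(check_lemma2(5))
```

The check is exhaustive over all inputs of weight i, so it is practical only for small $\ell$.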

Theorem 1. GAF(A, x, y) can be computed by an SM-protocol in which each
player sends at most $\ell\,\Lambda(\ell, \ell/4) \le \ell\, n^{H(1/4)}$ bits.

Proof: Players 1 and 2 construct the functions $f_i$ for $0 \le i \le \ell$ corresponding to
A as given by Lemma 2. They execute the protocol given by Lemma 1 for each
$f_i$ to enable the Referee to evaluate $f_i(x + y)$. The Referee, knowing x and y,
can determine $|x + y|$ and use the information sent by Players 1 and 2 for $f_{|x+y|}$
to evaluate $f_{|x+y|}(x + y)$. By Lemma 2, $f_{|x+y|}(x + y) = A(x + y) = \mathrm{GAF}(A, x, y)$,
hence the protocol is correct. By Lemma 1, each of the players sends at most
$\Lambda(\ell, \lfloor d_i/2 \rfloor)$ bits for each $0 \le i \le \ell$, where $d_i = \deg f_i \le \min\{i, \ell - i\} \le \ell/2$. The
claimed bound follows using estimates from Fact 1. ■
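To get a feel for the bound, one can add up the per-player message lengths numerically. The sketch below assumes that $\Lambda(\ell, b)$ denotes the binomial sum $\sum_{j \le b} \binom{\ell}{j}$ and that H is the binary entropy function; the function names are ours, and the cost model (one run of the Lemma 1 protocol per weight class i) is read off the proof above.

```python
from math import comb, log2

def Lam(ell, b):
    """Lambda(ell, b): number of words of {0,1}^ell with Hamming weight at most b."""
    return sum(comb(ell, j) for j in range(b + 1))

def H(x):
    """Binary entropy function, with H(0) = H(1) = 0."""
    return 0.0 if x in (0, 1) else -x * log2(x) - (1 - x) * log2(1 - x)

def simple_protocol_bits(ell):
    """Bits sent by one player: one Lemma 1 message per weight class i = 0..ell."""
    return sum(Lam(ell, min(i, ell - i) // 2) for i in range(ell + 1))

for ell in (8, 16, 32):
    bits = simple_protocol_bits(ell)
    # exponent e such that bits = n^e, where n = 2^ell
    print(ell, bits, round(log2(bits) / ell, 3), round(H(0.25), 3))
```

For small $\ell$ the measured exponent is inflated by the factor $\ell$ in front of $\Lambda$, so it approaches $H(1/4)$ only slowly as $\ell$ grows.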

3 Generalization 

We now present a protocol based on the notion of covering codes. The protocol 
in the previous section will follow as a special case. 

Definition 2. An (m, r)-covering code of length $\ell$ is a set of words $\{c_1, \ldots, c_m\}$
in $\{0,1\}^\ell$ such that for any $x \in \{0,1\}^\ell$, there is a codeword $c_i$ such that $d(x, c_i) \le r$,
where d(x, y) denotes the Hamming distance between x and y.

Theorem 2. If there is an (m, r)-covering code of length $\ell$, then there is an
SM-protocol for GAF(A, x, y) in which each player sends at most $m r \Lambda(\ell, r/2)$
bits.
Proof: Given an (m, r)-covering code $\{c_1, \ldots, c_m\}$, let us denote by $H_{ij}$ the
Hamming sphere of radius j around $c_i$:

$$H_{ij} := \{x \in \{0,1\}^\ell : d(x, c_i) = j\}.$$

For $A : \{0,1\}^\ell \to \{0,1\}$, it is easy to construct a multilinear polynomial $f_{ij}$
that agrees with A on all inputs from $H_{ij}$. More specifically, let us write, for
each $1 \le i \le m$ and $S \subseteq [\ell]$,

$$\delta_{iS}(x) := \prod_{\{k \in S :\, c_{ik} = 0\}} x_k \cdot \prod_{\{k \in S :\, c_{ik} = 1\}} (1 - x_k),$$

where $c_{ik}$ denotes the k'th coordinate of $c_i$. Note that $\delta_{iS}(x) = 1$ iff x differs
from $c_i$ in the coordinates $k \in S$.

Now, define the polynomial $f_{ij}$ by

$$f_{ij}(x) := \sum_{|S| = j} A(c_i + s)\,\delta_{iS}(x),$$

where $s \in \{0,1\}^\ell$ denotes the characteristic vector of $S \subseteq [\ell]$. It is easy to verify
that $f_{ij}(x) = A(x)$ for all $x \in H_{ij}$. Note also that $\deg f_{ij} \le r$.

The players use the protocol given by Lemma 1 on $f_{ij}$ for $1 \le i \le m$ and
$0 \le j \le r$. The Referee can determine i and j such that $(x + y) \in H_{ij}$ and
evaluate $f_{ij}(x + y) = A(x + y)$ from the information sent by the players. ■
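The sphere polynomials from this proof can be verified by brute force on a small example. The sketch below is not a simulation of the SM protocol: it only rebuilds $f_{ij}$ directly from A and checks that it has degree at most j and agrees with A on the sphere of radius j around $c_i$. Polynomials over GF(2) are stored as sets of monomials, and all function names are assumptions of this example.

```python
from itertools import combinations, product
import random

def delta(c, S):
    """Monomials (mod 2) of prod_{k in S, c_k = 0} x_k * prod_{k in S, c_k = 1} (1 + x_k)."""
    base = frozenset(k for k in S if c[k] == 0)
    flip = [k for k in S if c[k] == 1]
    return {base | frozenset(T) for r in range(len(flip) + 1)
            for T in combinations(flip, r)}

def f(A, c, j):
    """f_{ij} = sum over |S| = j of A(c + s) * delta_{iS}, for the codeword c = c_i."""
    ell = len(c)
    poly = set()
    for S in combinations(range(ell), j):
        point = tuple(c[k] ^ (k in S) for k in range(ell))   # c + s over Z_2
        if A[point]:
            poly ^= delta(c, S)                              # addition over GF(2)
    return poly

def dist(u, v):
    return sum(a != b for a, b in zip(u, v))

def evaluate(poly, x):
    return sum(all(x[k] for k in mono) for mono in poly) % 2

def check_sphere_polynomials(A, code, r):
    """For every word w, locate a sphere H_{ij} containing it and test f_{ij}(w) = A(w)."""
    ell = len(code[0])
    for w in product((0, 1), repeat=ell):
        i, j = next((i, dist(w, c)) for i, c in enumerate(code) if dist(w, c) <= r)
        p = f(A, code[i], j)
        too_big = any(len(mono) > j for mono in p)
        if too_big or evaluate(p, w) != A[w]:
            return False
    return True

ell = 5
rng = random.Random(1)
A = {w: rng.randint(0, 1) for w in product((0, 1), repeat=ell)}
code = [(0,) * ell, (1,) * ell]       # every word of length 5 is within distance 2 of one of these
print(check_sphere_polynomials(A, code, r=2))
```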

Remark 1. 1. Trivially, the all-0 vector and the all-1 vector form a $(2, \ell/2)$-covering
code of length $\ell$. Thus Theorem 1 is a special case of Theorem 2.

2. Assume $\ell = 2^v$, v a positive integer. The first-order Reed-Muller code
$\mathcal{R}(1, v)$ forms a $(2\ell, (\ell - \sqrt{\ell})/2)$-covering code of length $\ell$ [vL, Exercise 4.7.10].
Thus we have the following corollary.

Corollary 1. There is an SM-protocol of complexity $\ell(\ell - \sqrt{\ell})\,\Lambda(\ell, (\ell - \sqrt{\ell})/4)$.
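The covering parameters quoted in the remark can be confirmed by exhaustive search for small v. The sketch below builds the first-order Reed-Muller code as the truth tables of all affine functions $a \cdot x + b$ over GF(2) and computes its covering radius directly; the bit-packed encoding of words as integers and the function names are assumptions of this example, and the search is feasible only for small v (it visits all $2^{\ell}$ words).

```python
def reed_muller_1(v):
    """First-order Reed-Muller code R(1, v): truth tables of a.x + b as 2^v-bit integers."""
    ell = 2 ** v
    code = []
    for a in range(ell):
        for b in (0, 1):
            word = 0
            for x in range(ell):
                bit = (bin(a & x).count("1") + b) % 2    # a.x + b over GF(2)
                word |= bit << x
            code.append(word)
    return code

def covering_radius(code, ell):
    """Largest Hamming distance from any word of {0,1}^ell to its nearest codeword."""
    return max(min(bin(w ^ c).count("1") for c in code) for w in range(2 ** ell))

v = 2
ell = 2 ** v
code = reed_muller_1(v)
# number of codewords, measured covering radius, and the value (ell - sqrt(ell)) / 2
print(len(code), covering_radius(code, ell), (ell - int(ell ** 0.5)) // 2)
```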



4 Limitations 

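Theorem 3. Any SM-upper bound obtained via Theorem 2 is at least $n^{0.728}$.

Proof: For any (m, r)-covering code of length $\ell$, we must trivially have

$$m\,\Lambda(\ell, r) \ge 2^{\ell}. \qquad (1)$$

Thus the SM-upper bound from Theorem 2 is at least

$$\frac{r\, 2^{\ell}\, \Lambda(\ell, r/2)}{\Lambda(\ell, r)}. \qquad (2)$$

Let $r = \alpha\ell$. Using Fact 1 in (2), the upper bound is at least

$$\ell\alpha\, 2^{\ell(1 + H(\alpha/2) - H(\alpha))}.$$

This is minimized at $\alpha = 1 - 1/\sqrt{2}$, giving an upper bound of no less than

$$0.29289\ldots\,\ell\, 2^{0.7284\ldots\,\ell} > n^{0.728}.$$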
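The location of the minimum and the resulting exponent are easy to confirm numerically. The sketch below minimizes the exponent $1 + H(\alpha/2) - H(\alpha)$ over a grid on $(0, 1/2]$, where H is the binary entropy function; the grid resolution and the function names are choices made for this example.

```python
from math import log2, sqrt

def H(x):
    """Binary entropy function, with H(0) = H(1) = 0."""
    return 0.0 if x in (0, 1) else -x * log2(x) - (1 - x) * log2(1 - x)

def exponent(alpha):
    """Exponent of 2^ell in the bound for a covering radius r = alpha * ell."""
    return 1 + H(alpha / 2) - H(alpha)

# crude grid search over alpha in (0, 1/2]
grid = [i / 100000 for i in range(1, 50001)]
best = min(grid, key=exponent)

print(round(best, 4), round(1 - 1 / sqrt(2), 4))   # grid minimizer vs. closed form
print(round(exponent(best), 4))                    # minimum value of the exponent
```

Setting the derivative $\tfrac12 \log\frac{1-\alpha/2}{\alpha/2} - \log\frac{1-\alpha}{\alpha}$ to zero gives $2\alpha^2 - 4\alpha + 1 = 0$, whose root in $(0, 1/2]$ is $1 - 1/\sqrt{2}$, in agreement with the grid search.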
5 Upper Bound for 3 Players

We construct a protocol which gives an upper bound matching the lower bound 
of Theorem 3. First, we construct a covering code. 

Lemma 3. For any $\ell, r$, there is an (m, r)-covering code with $m = O(\ell\, 2^{\ell}/\Lambda(\ell, r))$.

Proof: We construct the code greedily. The first codeword $c_1$ can be arbitrary.
Each next codeword $c_i$ is chosen so that it maximizes the number of words x
such that $d(x, c_1) > r, d(x, c_2) > r, \ldots, d(x, c_{i-1}) > r$, but $d(x, c_i) \le r$. We can
find $c_i$ by exhaustive search over all words.

Let $S_i$ be the set of words x such that $d(x, c_1) > r, \ldots, d(x, c_i) > r$ and $w_i$
be the cardinality of $S_i$.

Claim. $w_{i+1} \le \left(1 - \frac{\Lambda(\ell, r)}{2^{\ell}}\right) w_i$.

Proof of Claim: For each $x \in S_i$, there are $\Lambda(\ell, r)$ pairs (x, y) such that $d(x, y) \le r$.
The total number of such pairs is therefore $w_i \Lambda(\ell, r)$. By the pigeonhole principle,
there is a y such that there are at least $w_i \Lambda(\ell, r)/2^{\ell}$ words $x \in S_i$ with $d(x, y) \le r$.

Recall that $c_{i+1}$ is chosen so that it maximizes the number of such x. Hence,
there are at least $w_i \Lambda(\ell, r)/2^{\ell}$ newly covered words $x \in S_i$ such that $d(x, c_{i+1}) \le r$.
This implies

$$w_{i+1} \le \left(1 - \frac{\Lambda(\ell, r)}{2^{\ell}}\right) w_i.$$

This proves the claim.

We have $w_0 = 2^{\ell}$ and

$$w_i \le \left(1 - \frac{\Lambda(\ell, r)}{2^{\ell}}\right)^{i} w_0 = \left(1 - \frac{\Lambda(\ell, r)}{2^{\ell}}\right)^{i} 2^{\ell}.$$

From $1 - x \le e^{-x}$ it follows that

$$w_i \le e^{-i\,\Lambda(\ell, r)/2^{\ell}}\, 2^{\ell}.$$

Let $m = \ln 2 \cdot \ell\, 2^{\ell}/\Lambda(\ell, r) + 1$. Then $w_m < 1$.

Hence, $w_m = 0$, i.e. $\{c_1, \ldots, c_m\}$ is an (m, r)-covering code. ■
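The greedy construction is short enough to run directly for small lengths. The sketch below encodes words as integers, covers the cube with radius-r balls, and compares the number of codewords with the bound $\ln 2 \cdot \ell\, 2^{\ell}/\Lambda(\ell, r) + 1$ derived above, where $\Lambda(\ell, r)$ is the size of a radius-r Hamming ball. The function names and the integer encoding are assumptions of this example, and the exhaustive search is exponential in $\ell$.

```python
from itertools import combinations
from math import comb, log

def greedy_covering_code(ell, r):
    """Greedy (m, r)-covering code of length ell; words are ell-bit integers."""
    # XOR masks of weight at most r: c ^ mask ranges over the radius-r ball around c
    masks = [sum(1 << k for k in S)
             for j in range(r + 1) for S in combinations(range(ell), j)]
    uncovered = set(range(2 ** ell))
    code = []
    while uncovered:
        # exhaustive search for the word whose ball contains the most uncovered words
        c = max(range(2 ** ell),
                key=lambda y: sum((y ^ m) in uncovered for m in masks))
        code.append(c)
        uncovered -= {c ^ m for m in masks}
    return code

def size_bound(ell, r):
    """The bound ln 2 * ell * 2^ell / Lambda(ell, r) + 1 on the number of codewords."""
    ball = sum(comb(ell, j) for j in range(r + 1))
    return log(2) * ell * 2 ** ell / ball + 1

ell, r = 6, 1
code = greedy_covering_code(ell, r)
print(len(code), round(size_bound(ell, r), 1))
```

In small cases the greedy code is typically much smaller than the bound, which only needs to hold up to the factor $\ell$ lost in the analysis.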

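Theorem 4. There is an SM-protocol for GAF(A, x, y) with communication
complexity $O(n^{0.73})$.

Proof: We apply Theorem 2 to the code of Lemma 3 and get a protocol with
communication complexity

$$m r \Lambda(\ell, r/2) = O\!\left(\frac{\ell\, 2^{\ell}}{\Lambda(\ell, r)}\right) r\, \Lambda(\ell, r/2) = O\!\left(\ell r\, 2^{\ell}\, \frac{\Lambda(\ell, r/2)}{\Lambda(\ell, r)}\right).$$

Let $r = \alpha\ell$, where $\alpha = 1 - 1/\sqrt{2}$ is the constant from Theorem 3. Then, using
estimates from Fact 1, the communication complexity is at most

$$O\!\left(\alpha \ell^{2}\, 2^{\ell(1 + H(\alpha/2) - H(\alpha) + \epsilon)}\right) = O(n^{0.73}).$$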
6 Upper Bounds for k Players

In this section, we generalize the idea from Section 2 to the k-player case. It
appears that for large values of k, the simpler ideas from Section 2 already give
nearly as efficient a protocol as can be obtained by generalizing Theorem 4. In
other words, the covering code (for large k) from Theorem 4 can be replaced
by the trivial $(2, \ell/2)$-covering code given by the all-0 and all-1 vectors (cf.
Remark 1).

The starting point again is a lemma from [BKL] generalizing Lemma 1:

Lemma 4 (BKL). Let f be an $\ell$-variate multilinear polynomial of degree at
most d over $Z_2$. Then $\mathrm{GAF}(f, x_1, \ldots, x_{k-1})$ has a k-player SM protocol in which
each player sends at most $\Lambda(\ell, \lfloor d/(k-1) \rfloor)$ bits.

Theorem 5. $\mathrm{GAF}(A, x_1, \ldots, x_{k-1})$ can be computed by a k-player SM protocol
in which each player sends at most $\ell\,\Lambda(\ell, \ell/(2k-2)) \le \ell\, n^{H(1/(2k-2))}$ bits.

Proof: Similar to Theorem 1. Players 1 through $k-1$ construct the polynomials
$f_i$, $0 \le i \le \ell$, corresponding to A as given by Lemma 2. Each $f_i$ is of degree at
most $\ell/2$. The players then follow the protocol from Lemma 4 for each of these
$f_i$. ■

7 Conclusions and Open Problems 

We presented improved upper bounds on the SM complexity of $\mathrm{GAF}_{Z_2^t,k}$. For
k = 3, we prove an upper bound of $O(n^{0.73})$, improving the previous bound of
$O(n^{0.92})$ from [BKL]. For general k, we show an upper bound of $O(n^{H(1/(2k-2))})$,
improving the bound of $O(n^{H(1/k)})$ from [BKL]. The first open problem is to
improve these bounds to close the gap between upper and lower bounds on the
SM complexity of $\mathrm{GAF}_{Z_2^t,k}$. Recall that the lower bound is $\Omega(n^{1/(k-1)}/(k-1))$
[BKL].

The second open problem concerns the SM complexity of $\mathrm{GAF}_{Z_n,k}$, i.e., the
generalized addressing function for the cyclic group. Note that the best known
upper bounds for the cyclic group [Am1] are significantly weaker than the upper
bounds we present here for the vector space $Z_2^t$. The lower bounds for both
$\mathrm{GAF}_{Z_2^t,k}$ and $\mathrm{GAF}_{Z_n,k}$ are $\Omega(n^{1/(k-1)}/(k-1))$ and follow from a general result
of [BKL] on SM complexity of $\mathrm{GAF}_{G,k}$ for arbitrary finite groups G. In contrast,
techniques for upper bounds appear to depend on the specific group G.
Abstract. A framework for solving certain multidimensional parametric
search problems in randomized linear time is presented, along with its 
application to optimization on matroids, including parametric minimum 
spanning trees on planar and dense graphs. 



1 Introduction 

In the multi-parameter minimum spanning tree problem, we are given an edge-weighted
graph G = (V, E), where the weight of each edge e is an affine function
of a d-dimensional parameter vector $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_d)$, i.e., $w(e) =
a_0(e) + \sum_{i=1}^{d} \lambda_i a_i(e)$. The topology and weight of the minimum spanning tree
are therefore functions of $\lambda$. Let $z(\lambda)$ be the weight of the minimum spanning
tree at $\lambda$. The goal is to find

$$z^* = \max_{\lambda} z(\lambda). \qquad (1)$$
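For a fixed parameter vector, evaluating the objective is an ordinary minimum spanning tree computation. The sketch below uses Kruskal's algorithm with a simple union-find; the edge representation `(u, v, (a0, a1, ..., ad))`, the function name `mst_weight`, and the three-vertex example graph are assumptions made for illustration and do not come from the paper.

```python
def mst_weight(n, edges, lam):
    """z(lam): weight of a minimum spanning tree when edge e costs
    a0(e) + sum_i lam[i] * a_i(e).  Vertices are 0..n-1; the graph is assumed connected."""
    parent = list(range(n))

    def find(u):
        # union-find lookup with path halving
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    weighted = sorted(
        (a[0] + sum(l * ai for l, ai in zip(lam, a[1:])), u, v)
        for u, v, a in edges
    )
    total = 0.0
    for w, u, v in weighted:          # Kruskal: scan edges by increasing weight
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            total += w
    return total

# Triangle with one parameter (d = 1); the edge weights are 1 + lam, 2 - lam and 0.
edges = [(0, 1, (1, 1)), (1, 2, (2, -1)), (0, 2, (0, 0))]
for lam in (0.0, 0.5, 1.0):
    print(lam, mst_weight(3, edges, (lam,)))
```

Since $z(\lambda)$ is the minimum over all spanning trees of an affine function of $\lambda$, it is concave and piecewise linear; in this toy instance it peaks where the two parametric edges tie.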

Problem (1) arises in the context of Lagrangian relaxation. For example,
Camerini et al. [5] describe the following problem. Suppose each edge e of G has
an installation cost w(e) and d possible maintenance costs $m_i(e)$, one for each of
d possible future scenarios, where scenario i has probability $p_i$. Edge e also has a
reliability $q_i(e)$ under each scenario i. Let $\mathcal{T}$ denote the set of all spanning trees
of G. Minimizing the total installation and maintenance costs while maintaining
an acceptable level of reliability Q under all scenarios is expressible as

$$\min_{T \in \mathcal{T}} \left\{ \sum_{e \in T} \left( w(e) + \sum_{i=1}^{d} p_i m_i(e) \right) : \prod_{e \in T} q_i(e) \ge Q,\ i = 1, \ldots, d \right\}. \qquad (2)$$

A good lower bound on the solution to (2) can be obtained by solving the
Lagrangian dual of (2), which has the form (1) with $a_0(e) = w(e) + \sum_{i=1}^{d} p_i m_i(e)$
and $a_i(e) = -\log q_i(e) + (\log Q)/(|V| - 1)$.¹

In this paper, we give linear-time randomized algorithms for the fixed-dimensional
parametric minimum spanning tree problem for planar and dense graphs
(i.e., those where $m = \Theta(n^2)$). Our algorithms are based on Megiddo's method
of parametric search [24], a technique that turns solutions to fixed-parameter
problems (e.g., non-parametric minimum spanning trees) into algorithms for
parametric problems. This conversion is often done at the price of incurring a
polylogarithmic slowdown in the run time for the fixed-parameter algorithm. Our
approach goes beyond the standard application of Megiddo's method to eliminate
this slowdown, by applying ideas from the prune-and-search approach to fixed-dimensional
linear programming. Indeed, the mixed graph-theoretic/geometric
nature of our problems requires us to use pruning at two levels: geometrically
through cuttings and graph-theoretically through sparsification.

* Supported in part by the National Science Foundation under grant CCR-9520946.

¹ We also need $\lambda \ge 0$; this can easily be handled by our scheme.

History and New Results. The parametric minimum spanning tree problem is 
a special case of the parametric matroid optimization problem [14]. The La- 
grangian relaxations of several non-parametric matroid optimization problems 
with side constraints — called matroidal knapsack problems by Camerini et al. [5] 
— are expressible as problems of the form (1). More generally, (1) is among the 
problems that can be solved by Megiddo’s method of parametric search [23,24], 
originally developed for one-dimensional search, but readily extendible to any 
fixed dimension [26,10,4]. 

The power of the parametric search has been widely recognized (see, e.g., [2]). 
Part of the appeal of the method is its formulation as an easy-to-use “black box.” 
The key requirement is that the underlying fixed-parameter problem (i.e., the
problem of evaluating $z(\lambda)$ for fixed $\lambda$) have an algorithm where all numbers
manipulated are affine functions of $\lambda$. If this algorithm runs in time O(T), then
the parametric problem can be solved in time and if there is IT-processor, 

D-step parallel algorithm for the fixed parameter problem, the run time can be 
improved to 0{T{DlogW)‘^). In some cases, this can be further improved to 
0{T{D logFF)'^)**. This applies to the parametric minimum spanning tree 

problem, for which one can obtain $D = O(\log n)$ and $W = O(m)$, where $n = |V|$
and $m = |E|$. The (fixed-parameter) minimum spanning tree problem can be
solved in randomized O(m) expected time [21] and $O(m\,\alpha(m,n) \log \alpha(m,n))$
deterministic time [7]. In any event, by its nature, parametric search introduces a
$\log^{O(d)} n$ slowdown with respect to the run time of the fixed-parameter problem.
Thus, the algorithms it produces are unlikely to be optimal (for an exception to 
this, see [11]). 

Frederickson [20] was among the first to consider the issue of optimality in 
parametric search, in the sense that no slowdown is introduced and showed how 
certain location problems on trees could be solved optimally. Later, certain one- 
dimensional parametric search problems on graphs of bounded tree-width [18] 
and the one-dimensional parametric minimum spanning tree problem on planar 
graphs and on dense graphs [17,19] were shown to be optimally solvable. A key 
technique in these algorithms is decimation: the organization of the search into 
phases to achieve geometric reduction of problem size. This is closely connected 
with the prune-and-search approach to fixed-dimensional linear programming 



Note that the O-notation in all these time bounds hides “constants” that depend on 
d. The algorithms to be presented here exhibit similar constants. 
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[25,12,9]. While the geometric nature of linear programming puts fewer obstacles 
to problem size reduction, a similar effect can be achieved for graph problems 
through sparsification [15,16] a method that has been applied to one-dimensional 
parametric minimum spanning trees before [1,19]. 

Here we show that multi-parameter minimum spanning trees can be found 
in randomized linear expected time on planar and on dense graphs. Our pro- 
cedures use prune-and-search geometrically through cuttings [8], to narrow the 
search region for the optimal solution, as well as graph-theoretically through 
sparsification. More generally, we identify decomposability conditions that al- 
low parametric problems to be solvable within the same time bound as their 
underlying fixed-parameter problems. 

2 Multidimensional Search 

Let h be a hyperplane in $\mathbb{R}^d$, let $\Lambda$ be a convex subset of $\mathbb{R}^d$, and let $\mathrm{sign}_\Lambda(h)$
be +1, 0, or −1, depending, respectively, on whether $h(\lambda) < 0$ for all $\lambda \in \Lambda$,
$h(\lambda) = 0$ for some $\lambda \in \Lambda$, or $h(\lambda) > 0$ for all $\lambda \in \Lambda$. Hyperplane h is said to
be resolved if $\mathrm{sign}_\Lambda(h)$ is known. An oracle is a procedure that can resolve any
given hyperplane. The following result is known (see also [25]):

Theorem 1 (Agarwal, Sharir, and Toledo [3]). Given a collection $\mathcal{H}$ of n
hyperplanes in $\mathbb{R}^d$ and an oracle B for $\Lambda$, it is possible to find either a hyperplane
that intersects $\Lambda$ or a simplex $\Delta$ that fully contains $\Lambda$ and intersects at most n/2
hyperplanes in $\mathcal{H}$ by making $O(d^3 \log d)$ oracle calls. The time spent in addition
to the oracle calls is n ■ log^”^ d. 

Corollary 1. Given a set $\mathcal{H}$ of n hyperplanes and a simplex $\Delta$ containing $\Lambda$,
a simplex $\Delta'$ intersecting at most n' elements of $\mathcal{H}$ and such that $\Lambda \subseteq \Delta' \subseteq \Delta$
can be found with $O(d^3 \log d \,\lg(n/n'))$ calls to an oracle.

If $\Lambda$ denotes the set of maximizers of function z, we have the following [4]:

Lemma 1. Locating the position of the maximizers of z relative to a given hyperplane
h reduces to carrying out three (d−1)-dimensional maximization problems
of the same form as (1).

3 Decomposable Problems 

Let m be the problem size. A decomposable optimization problem is one whose 
fixed-parameter version can be solved in two stages: 

- An O(m)-time decomposition stage, independent of $\lambda$, which produces a recursive
decomposition of the problem represented by a bounded-degree decomposition
tree D. The nodes of D represent subproblems and its root is
the original problem. For each node v in D, $m_v$ is the size of the subproblem
associated with v. The children of v are the subproblems into which v is
decomposed.
- An O(m)-time optimization stage, where the decomposition tree is traversed
level by level from the bottom up and each node is replaced by a sparse substitute.
$z(\lambda)$ can be computed in O(1) time from the root's sparse substitute.

Note that after the decomposition stage is done we can evaluate $z(\lambda)$ for multiple
values of $\lambda$ by executing only the optimization stage.

We will make some assumptions about the decomposition stage: 

(D1) D is organized into levels, with leaves being at level 0, their parents at level
1, grandparents at level 2, and so forth. $L_i$ will denote the set of nodes at
level i. The index of the last level is k = k(n).

(D2) There exists a constant $\alpha > 1$, independent of m, such that $|L_i| = O(m/\alpha^i)$
and $m_u = O(\alpha^i)$ for each $u \in L_i$.

We also make assumptions about the optimization stage: 

(O1) For each $v \in L_i$, the solution to the subproblem associated with v can be
represented by a sparse substitute of size $O(\beta(i))$, such that $\beta(i)/\alpha^i \le 1/\gamma^i$
for some $\gamma > 1$. For i = 0 the sparse substitute can be computed in
O(1) time by exhaustive enumeration, while for i > 0 the substitute for v
depends only on the sparse substitutes for v's children.

(O2) The algorithm for computing the sparse substitute for any node v takes
time linear in the total size of the sparse substitutes of v's children. This
algorithm is piecewise affine; i.e., every number that it manipulates is
expressible as an affine combination of the input numbers.

Note that (i) by (O1), the total size of all sparse substitutes for level i is $O(m/\gamma^i)$
and (ii) assumption (O2) holds for many combinatorial algorithms; e.g., most
minimum spanning tree algorithms are piecewise affine.
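The per-level size estimate in note (i), and the reason it is compatible with a linear-time optimization stage, can be spelled out as follows. This is a short worked calculation under assumptions (D2), (O1), and (O2) as stated above, writing $\alpha$ and $\gamma$ for the two constants and $\beta(i)$ for the substitute size at level i.

```latex
% Total size of the sparse substitutes at level i, using |L_i| = O(m / alpha^i)
% from (D2) and beta(i) / alpha^i <= 1 / gamma^i from (O1):
\sum_{v \in L_i} O(\beta(i))
  \;=\; O\!\left(\frac{m}{\alpha^{i}}\right)\cdot O(\beta(i))
  \;=\; O\!\left(m \cdot \frac{\beta(i)}{\alpha^{i}}\right)
  \;=\; O\!\left(\frac{m}{\gamma^{i}}\right).

% By (O2) the work at each level is linear in the substitute sizes one level below,
% so summing over all levels gives a geometric series with ratio 1/gamma < 1:
\sum_{i=0}^{k} O\!\left(\frac{m}{\gamma^{i}}\right)
  \;\le\; O\!\left(m \sum_{i \ge 0} \gamma^{-i}\right)
  \;=\; O\!\left(m \cdot \frac{\gamma}{\gamma - 1}\right)
  \;=\; O(m).
```

The constant hidden in the final bound depends on $\gamma$ but not on m.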



4 The Search Strategy 

We now describe our approach in general terms; its applications will be presented
in Section 5. We use the following notation. Given a collection of hyperplanes
$\mathcal{H}$ in $\mathbb{R}^d$, $\mathcal{A}(\mathcal{H})$ denotes the arrangement of $\mathcal{H}$; i.e., the decomposition of $\mathbb{R}^d$
into faces of dimension 0 through d induced by $\mathcal{H}$ [13]. Given $\Gamma \subseteq \mathbb{R}^d$, $\mathcal{A}_\Gamma(\mathcal{H})$
denotes the restriction of $\mathcal{A}(\mathcal{H})$ to $\Gamma$.

Overview. The search algorithm simulates the bottom-up, level-by-level execution
of the fixed-parameter algorithm for all $\lambda$ within a simplex $\Delta \subseteq \mathbb{R}^d$ known
to contain $\Lambda$, the set of maximizers of z. The outcome of this simulation for any
node v in the decomposition tree is captured by a parametric sparse substitute
for v, which consists of (i) a decomposition of $\Delta$ into disjoint regions such that
the sparse substitute for each region is unique and (ii) the sparse substitute for
each region of the decomposition. Given a parametric sparse substitute for v, obtaining
the sparse substitute for v for any fixed $\lambda \in \Delta$ becomes a point location
problem in the decomposition.
After simulating the fixed-parameter algorithm, we will have a description of
all possible outcomes of the computation within $\Delta$, which can be searched to
locate some $\lambda^* \in \Lambda$. For efficiency, the simulation of each level is accompanied
by the shrinkage of $\Delta$ using Theorem 1. This is the point where we run the risk
of incurring the polylogarithmic slowdown mentioned in the Introduction. We
get around this with three ideas. First, we shrink $\Delta$ only to the point where the
average number of regions in the sparse substitute within $\Delta$ for a node at level
i is constant-bounded (see also [17]). Second, the oracle used at level i relies on
bootstrapping: By Lemma 1, this oracle can be implemented by solving three
optimization problems in dimension d − 1. If parametric sparse substitutes for
all nodes at level i − 1 are available, the solution to any such problem will not
require reprocessing lower levels.

The final issue is the relationship between node v's parametric sparse substitute
P(v) and the substitutes for v's children. An initial approximation to the
subdivision for P(v) is obtained by overlapping the subdivisions for the substitutes
of the children of v. Within every face F of the resulting subdivision of
$\Delta$ there is a unique sparse substitute for each of v's children. However, F may
have to be further subdivided because there may still be multiple distinct sparse
substitutes for v within F. Instead of producing them all at once, which would
be too expensive, we proceed in three stages. First, we get rough subdivisions of
the F's through random cuttings [8]. Each face of these subdivisions will contain
only a relatively small number of regions of P(v). In the second stage, we shrink
$\Delta$ so that the total number of regions in the rough subdivisions over all nodes
in level i is small. Finally, we generate the actual subdivisions for all v.



Intersection Hyperplanes. Consider a comparison between two values $a(\lambda)$ and
$b(\lambda)$ that is carried out when computing the sparse certificate for v or one of
its descendants for some $\lambda \in \Delta$. By assumption (O2), $a(\lambda)$ and $b(\lambda)$ are affine
functions of $\lambda$; their intersection hyperplane is $h_{ab} = \{\lambda : a(\lambda) = b(\lambda)\}$. The
outcome of the comparison for a given $\lambda$-value depends only on which side of
$h_{ab}$ contains the value. Let $\mathcal{I}(v)$ consist of all such intersection hyperplanes for
v. Then, there is a unique sparse substitute for each face of $\mathcal{A}_\Delta(\mathcal{I}(v))$, since all
comparisons are resolved in the same way within it. Thus, our parametric sparse
substitutes consist of $\mathcal{A}_\Delta(\mathcal{I}(v))$, together with the substitute for its faces.
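The role of an intersection hyperplane can be made concrete with a few lines of code. In the sketch below an affine function of the parameter vector is stored as a coefficient tuple `(c0, c1, ..., cd)`; this representation and the helper names `affine`, `intersection_hyperplane`, and `compare` are assumptions made for illustration only.

```python
def affine(coeffs, lam):
    """Evaluate c0 + sum_i c_i * lam_i for coeffs = (c0, c1, ..., cd)."""
    return coeffs[0] + sum(c * l for c, l in zip(coeffs[1:], lam))

def intersection_hyperplane(a, b):
    """Coefficients of a(lam) - b(lam); its zero set is the hyperplane where a and b tie."""
    return tuple(x - y for x, y in zip(a, b))

def compare(a, b, lam):
    """Outcome of comparing a(lam) with b(lam): -1, 0, or +1.

    The result depends only on which side of the intersection hyperplane contains lam."""
    s = affine(intersection_hyperplane(a, b), lam)
    return (s > 0) - (s < 0)

a = (1.0, 1.0, 0.0)    # a(lam) = 1 + lam_1
b = (2.0, -1.0, 0.0)   # b(lam) = 2 - lam_1
for lam in ((0.0, 3.0), (0.5, -1.0), (1.0, 0.0)):
    print(lam, compare(a, b, lam))
```

Every point of a region that avoids all such hyperplanes yields the same sequence of comparison outcomes, which is why a single sparse substitute serves the whole region.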

For fast retrieval of sparse substitutes, we need a point location data structure
for $\mathcal{A}_\Delta(\mathcal{I}(v))$. The complexity of the arrangement and the time needed to build it
are $O(n^d)$, where n is the number of elements of $\mathcal{I}(v)$ that intersect the interior of
$\Delta$ [13]. A point location data structure with space requirement and preprocessing
time $O(n^d)$ can be built which answers point location queries in $O(\log n)$ time [6].

We will make certain assumptions about intersection hyperplanes. Let F be
a face of $\mathcal{A}_\Delta(\bigcup\{\mathcal{I}(u) : u \text{ is a child of } v\})$. Then, we have the following.

(H1) The number of elements of $\mathcal{I}(v)$ that intersect the interior of F is $O(m_v)$.
(H2) A random sample of size n of the elements of $\mathcal{I}(v)$ that intersect the interior
of F can be generated in O(n) time.
Shrinking the Search Region. Let $\kappa_\Delta(v)$ denote the number of faces of $\mathcal{A}_\Delta(\mathcal{I}(v))$
and let $\mathcal{I}_\Delta(v)$ denote the elements of $\mathcal{I}(v)$ that intersect the interior of $\Delta$. The
goal of the shrinking algorithm is to reduce the search region $\Delta$ so that, after
simulating level r of the fixed-parameter algorithm,

yl C A and E IIaMI (3) 

Lemma 2. If (3) holds, then i^a{v) < 2mfa^ . 

Lemma 3. Suppose we are given a simplex A satisfying (3) for r = i — 1 and 
parametric sparse substitutes within A for all v G Li_i. Then, with high prob- 
ability, 0{id‘^ log d) oracle calls and 0(mj overhead suffice to find a new 
simplex A satisfying (3) for r = i, together with Ia{v) for all v G Li. 

Proof. Let s = a* and for each v € Li let denote 1 J{Ia(m) : u a child of w}. 

To shrink the search region, first do the following for each v £ Li and each 
face F G A(?I(v)): ^From among the elements of T(v) that arise during the 
computation of a sparse certificate for some X G F choose uniformly at random a 
set Cf of size . The total number of elements in all the sets Cp is {m/s) ■ 

gi-i/(4d) _ gy theory of cuttings [8], with high probability any 

face in a certain triangulation of A{Cp), known as the canonical triangulation, 
intersects at most elements oflA("c)- 

Next, apply Corollary 1 to the set % = \Jy^p.{h : h G TLv or h G Cp,F a 
face of A{TLv)} to find a simplex A' that intersects at most m/s'^'^^ elements of 
% and such that A C A' C A. Set A 4— A'. Since \'H\ = the total 

number of oracle calls is 0(d‘* log dlog s) = Ofid'^ log d). 

Now, for each v G Li, we compute Ia{v) in two steps. First, for each v G Li 
construct the canonical triangulation Cy of the arrangement of Qy, which consists 
of all hyperplanes in U {/i G Cp : F a face of A{TLy)} intersecting A. The 
total time is 0{m/s), since at most m/s^^^ v’s have non-empty Qy’s and for 
each such v the number of regions in Ay (Gy) is 0(5"^) . Secondly, enumerate all 
hyperplanes in Xa{v) that intersect A'. This takes time 0{m/ With high 
probability, at most {m/s) ■ hyperplanes will be found. 

Finally, we apply Corollary 1 to the set % = {}^^^_Xa{v) to obtain a simplex 
A' intersecting at most m/ s^^^^ of the elements of TL and such that A C A' C A. 
Set A 4— A'. The total number of oracle calls is log dlog s) = 0{id'^logd). 

A Recursive Solution Scheme. We simulate the execution of the fixed-parameter
algorithm level by level, starting at level 0; each step of the simulation produces
parametric sparse certificates for all nodes within a given level. We use induction
on the level i and the dimension d. For i = 0 and every $d \ge 0$, we compute the
parametric sparse substitutes for all $v \in L_i$ by exhaustive enumeration. This
takes time $O(c_d m)$, for some constant $c_d$.

Lemma 4. Let $\Delta$ be a simplex satisfying (3) for $r = i - 1$, and suppose that
parametric sparse substitutes within $\Delta$ for all $v \in L_{i-1}$ are known. Then we can,
with high probability, in time $O(f_d \cdot m/\gamma_d^i)$ find a new simplex within which the
parametric sparse substitute for the root of $D$ has $O(1)$ regions.

Proof. To process $L_i$, $i > 0$, we use recursion on the dimension, $d$. When $d = 0$,
we have the fixed-parameter problem. By (O1), given sparse substitutes for all
$v \in L_{i-1}$, we can process $L_i$ in time $O(\beta(i) \cdot m/\alpha^i) = O(f_0 m/\gamma_0^i)$, where $f_0 = 1$,
$\gamma_0 = \gamma$. Hence, $L_i$ through $L_k$ can be processed in time $O(f_0 m/\gamma_0^i)$.

Next, consider $d \ge 1$. To process level $i$, we assume that level $i - 1$ has been
processed so that (3) holds for $r = i - 1$. Our goal is to successively maintain
(3) for $r = i, i + 1, \ldots, k$. Thus, after simulating level $k$, $|\mathcal{I}_\Delta(\text{root})| = O(1)$.

The first step is to use Lemma 3 to reduce $\Delta$ to a simplex satisfying (3)
for $r = i$ and to obtain $\mathcal{I}_\Delta(v)$ for all $v \in L_i$. The oracle will use the sparse
certificates already computed for level $i - 1$: Let $h$ be the hyperplane to be
resolved. If $h \cap \Delta = \emptyset$, we resolve $h$ in time $O(d)$ by finding the side of $h$ that
contains $\Delta$. Otherwise, by Lemma 1, we must solve three $(d - 1)$-dimensional
problems. For each such problem, we do as follows. For every $v \in L_{i-1}$, find
the intersection of $h$ with $\mathcal{I}_\Delta(v)$. This defines an arrangement in the $(d - 1)$-
dimensional simplex $\Delta' = \Delta \cap h$. The sparse substitute for each region of the
arrangement is unique and known; thus, we have parametric sparse substitutes
for all $v \in L_{i-1}$. By hypothesis, we can compute in time $O(f_{d-1} m/\gamma_{d-1}^i)$.

By Lemma 3, the time for all oracle calls is 0{i ■ d'^logd • fd-i ■ m/ 7 ^_^). If 
we discover that h intersects A, we return 2 ^. After shrinking A, La{v) will be 
known and we can build Aa{2(v)). By Lemma 2, this arrangement has 
regions. By assumption (01) we can find the sparse substitute for each face F 
of AA(T(r')) as follows. First, choose an arbitrary point A° in the interior of F. 
Next, for each child u of v, find the sparse substitute for u at A°. Finally, use 
these sparse substitutes to compute the sparse substitute for v at A°; this will 
be the sparse substitute for all A G A. Thus, the total time needed to compute 
the parametric sparse substitutes for $v \in L_i$ is $O(\beta(i) \cdot (m/\alpha^i)) = O(m/\gamma^i)$. The
oracle calls dominate the work, taking a total time of $O(i \cdot g_d \cdot f_{d-1} \cdot m/\gamma_{d-1}^i) =
O(f_d \cdot m/\gamma_d^i)$, where $f_d = g_d \cdot f_{d-1}$ and $\gamma_d$ satisfies $1 < \gamma_d < \gamma_{d-1}$.

Theorem 2. The optimum solution to a decomposable problem can be found in
$O(m)$ time with high probability.

Proof. Let $w$ be the root of the decomposition tree. After simulating the
execution of levels 0 through $k$, we are left with a simplex $\Delta$ that is crossed by
$O(1)$ hyperplanes of $\mathcal{I}(w)$, and $O(1)$ invocations of Theorem 1 suffice to reduce
$\Delta$ to a simplex that is crossed by no hyperplane of $\mathcal{I}(w)$. Within this simplex,
there is a unique optimum solution, whose cost as a function of $\lambda$ is, say, $c \cdot \lambda$.
The maximizer can now be located through linear programming with objective
function $c \cdot \lambda$. The run time is $O(a_d)$, for some value $a_d$ that depends only on $d$.

5 Applications 

We now apply Theorem 2 to several matroid optimization problems. We
will rely on some basic matroid properties. Let $M = (S, I)$ be a matroid on the
set of elements $S$, where $I$ is the set of independent subsets of $S$. Let $B$ be a subset
of $S$ and let $A$ be a maximum-weight independent subset of $B$. Then, there exists
some maximum-weight independent set of $S$ that does not contain any element
of $B \setminus A$. Furthermore, the identity of the maximum-weight independent subset
of any such $B$ depends on the relative order, but not on the actual values, of
the weights of the elements of $B$. Therefore, the maximum number of distinct
comparisons to determine the relative order of elements over all possible choices
of $\lambda$ is $O(|B|^2)$. Thus, (H1) is satisfied. Assumption (H2) is satisfied, since we can
get a random sample of size $n$ of the intersection hyperplanes for $B$ by picking $n$
pairs of elements from $B$ uniformly at random.

Uniform and Partition Matroids. (This example is for illustration; both problems
can be solved by linear programming.) Let $S$ be an $m$-element set where every
$e \in S$ has a weight $w(e) = a_0(e) + \sum_{j=1}^{d} \lambda_j a_j(e)$, let $k \le m$ be a fixed positive
integer, and let $z(\lambda) = \max_{B \subseteq S, |B| = k} \sum_{e \in B} w(e)$. The problem is to find $z^* =
\min_\lambda z(\lambda)$. The fixed-parameter problem is uniform matroid optimization. $D$ is
as follows: If $|S| \le k$, $D$ consists of a single vertex $v$ containing all the elements of
$S$. Otherwise, split $S$ into two sets of size $m/2$; $D$ consists of a root $v$ connected
to the roots of trees for these sets. Thus, $D$ satisfies conditions (D1) and (D2).

The non-parametric sparse substitute for node $v$ consists of the $k$ largest
elements among the subset $S_v$ corresponding to $v$. A sparse substitute for $v$
can be computed from its children in $O(k)$ time. Thus, (O2) is satisfied and, if
$k = O(1)$, so is (O1). Hence, $z^*$ can be found in $O(m)$ time.
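
The fixed-parameter (non-parametric) step described above can be sketched as follows. This is only an illustration for a single fixed value of the parameter, so every element has a plain numeric weight; the function names `merge_substitutes` and `substitute` are hypothetical and do not come from the paper.

```python
def merge_substitutes(left, right, k):
    """Combine two children's substitutes (each sorted in decreasing order)
    into the parent's substitute. At most k merge steps, so O(k) time."""
    merged, i, j = [], 0, 0
    while len(merged) < k and (i < len(left) or j < len(right)):
        if j >= len(right) or (i < len(left) and left[i] >= right[j]):
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged


def substitute(weights, k):
    """Sparse substitute of a node of D: the k largest weights below it.
    A set with at most k elements is a leaf; otherwise split in halves."""
    if len(weights) <= k:
        return sorted(weights, reverse=True)
    mid = len(weights) // 2
    return merge_substitutes(substitute(weights[:mid], k),
                             substitute(weights[mid:], k), k)


top = substitute([5, 1, 9, 3, 7, 2, 8], 3)
z = sum(top)  # value of the uniform matroid optimum for this fixed parameter
```

The parent never looks at more than $k$ elements from each child, which is the property the text uses to argue that (O2) holds.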

In partition matroids, the set $S$ is partitioned into disjoint subsets $S_1, \ldots, S_r$
and $z(\lambda) = \max\{\sum_{e \in B} w(e) : |B \cap S_i| \le 1 \text{ for } i = 1, 2, \ldots, r\}$. $z^* =
\min_\lambda z(\lambda)$ can be computed in $O(m)$ time by similar techniques.

Minimum Spanning Trees in Planar Graphs. $D$ will represent a recursive separator-
based decomposition of the input graph $G$. Every node $v$ in $D$ corresponds to a
subgraph $G_v$ of $G$ with $n_v$ vertices; the root of $D$ represents all of $G$. The
children $u_1, \ldots, u_r$ of $v$ represent a decomposition of $G_v$ into edge-disjoint subgraphs
$G_{u_j}$ such that $n_{u_j} \le n_v/\alpha$ for some $\alpha > 1$, which share a set $X_v$ of boundary
vertices, such that $|X_v| = O(\sqrt{n_v})$. $D$ satisfies conditions (D1) and (D2) and
can be constructed in time $O(n)$ where $n$ is the number of vertices of $G$ [22].

A sparse substitute for a node $x$ with a set of boundary vertices $X$ is obtained
as follows: First, compute a minimum spanning tree $T_x$ of $G_x$. Next, discard all
edges in $E(G_x) \setminus E(T_x)$ and all isolated vertices. An edge $e$ is contractible if it
has a degree-one endpoint that is not a boundary vertex, or it shares a degree-
two non-boundary vertex with another edge $f$ such that cost($e$) $\le$ cost($f$). Now,
repeat the following step while there is a contractible edge in $G_x$: choose any
contractible edge $e$ and contract it. While doing this, keep a running total of
the cost of the contracted edges. The size of the resulting graph $H_x$ is $O(|X|) =
O(\sqrt{n_x})$. Also, $H_x$ is equivalent to $G_x$ in that if the former is substituted for the
latter in the original graph, then the minimum spanning tree of the new graph,
together with the contracted edges, constitutes a minimum spanning tree of the
original graph. The sparse substitute computation satisfies (O1) and (O2).

Minimum Spanning Trees in Dense Graphs. $D$ is built in two steps. First, a
vertex partition tree is constructed by splitting the vertex set into two equal-size
parts (to within 1) and then recursively partitioning each half. This results in a
complete binary tree of height $\lg n$ where nodes at depth $i$ have $n/2^i$ vertices.
From the vertex partition tree we build an edge partition tree: For any two
nodes $x$ and $y$ of the vertex partition tree at the same depth $i$ containing vertex
sets $V_x$ and $V_y$, create a node $E_{xy}$ in the edge partition tree containing all edges
of $G$ in $V_x \times V_y$. The parent of $E_{xy}$ is $E_{uw}$, where $u$ and $w$ are, respectively, the
parents of $x$ and $y$ in the vertex partition tree. An internal node $E_{xy}$ will have
three children if $x = y$ and four otherwise. $D$ is built from the edge partition
tree by including only non-empty nodes.

Let $u = E_{xy}$ be a node in $D$. Let $G_u$ be the subgraph of $G$ with vertex set
$V_x \cup V_y$ and edge set $E \cap (V_x \times V_y)$. For every $j$ between 0 and the depth of $D$, (i)
there are at most depth-j nodes, (ii) the edge sets of the graphs associated 
with the nodes at depth k are disjoint and form a partition of E, and (iii) if u is 
at depth j, G„ has at most n/2^ vertices and v? edges. If G is dense, then 
$m_u = |E(G_u)| = \Theta(|V(G_u)|^2)$ for all $u$. Thus, (D1) and (D2) hold. The sparse
substitute for $G_u$ is obtained by deleting from $G_u$ all edges not in its minimum
spanning forest (which can be computed in $O(|V(G_u)|^2) = O(m_u)$ time). The
size of the substitute is $O(\sqrt{m_u})$. Thus, (O1) and (O2) are satisfied.
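
As a rough illustration of this sparse substitute for one fixed value of the parameter, the sketch below keeps only the edges of a minimum spanning forest. It uses Kruskal's algorithm with a union-find structure because that is short to write; this is a substitution for illustration, and it does not achieve the $O(|V(G_u)|^2)$ bound mentioned in the text, which would call for a Prim-style method on a dense graph. The function name and the edge format `(cost, u, v)` are assumptions.

```python
def spanning_forest_substitute(num_vertices, edges):
    """Return the edges of a minimum spanning forest of the given graph.

    edges is a list of (cost, u, v) tuples with 0 <= u, v < num_vertices.
    All other edges are discarded, as in the sparse substitute for G_u.
    """
    parent = list(range(num_vertices))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    kept = []
    for cost, u, v in sorted(edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:          # edge joins two components: keep it
            parent[root_u] = root_v
            kept.append((cost, u, v))
    return kept


example_edges = [(4, 0, 1), (1, 1, 2), (3, 0, 2), (2, 2, 3), (5, 1, 3)]
forest = spanning_forest_substitute(5, example_edges)  # vertex 4 is isolated
```

A forest on $|V(G_u)|$ vertices has fewer than $|V(G_u)|$ edges, which is where the $O(\sqrt{m_u})$ size bound for dense graphs comes from.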

6 Discussion 

Our work shows the extent to which prune-and-search can be used in parametric 
graph optimization problems. Unfortunately, the heavy algorithmic machinery 
involved limits the practical use of our ideas. We also suspect that our decompos- 
ability framework is too rigid, and that it can be relaxed to solve other problems. 
Finally, one may ask whether randomization is necessary. Simply substituting 
randomized cuttings by deterministic ones [6] gives an unacceptable slowdown.

1 Introduction

Given an alphabet $\Sigma = \{a_1, \ldots, a_n\}$ and a corresponding list of weights $[w_1, \ldots,
w_n]$, an optimal prefix code is a prefix code for $\Sigma$ that minimizes the weighted
length of a code string, defined to be $\sum_{i=1}^{n} w_i l_i$, where $l_i$ is the length of the
codeword assigned to $a_i$. This problem is equivalent to the following problem:
given a list of weights $[w_1, \ldots, w_n]$, find an optimal binary code tree, that is,
a binary tree $T$ that minimizes the weighted path length $\sum_{i=1}^{n} w_i l_i$, where $l_i$ is
the level of the $i$-th leaf of $T$ from left to right. If the list of weights is sorted,
this problem can be solved in $O(n)$ time by one of the efficient implementations of
Huffman's Algorithm [Huf52]. Any tree constructed by Huffman's Algorithm is
called a Huffman tree.

In this paper, we consider optimal L-restricted prefix codes. Given a list of
weights $[w_1, \ldots, w_n]$ and an integer $L$, with $\lceil \log n \rceil \le L \le n - 1$, an optimal
L-restricted prefix code is a prefix code that minimizes $\sum_{i=1}^{n} w_i l_i$ constrained to
$l_i \le L$ for $i = 1, \ldots, n$. Gilbert [Gil71] recommends the use of these codes when
the weights $w_i$ are inaccurately known. Zobel and Moffat [ZM95] describe the
use of word-based Huffman codes for compression of large textual databases.
Their application allows a maximum of 32 bits for each codeword. For the
cases that exceed this limitation, they recommend the use of L-restricted codes.

Some methods can be found in the literature to generate optimal L-restricted
prefix codes. Different techniques of algorithm design have been used to solve
this problem. The first polynomial algorithm is due to Garey [Gar74]. The
algorithm is based on dynamic programming and it has an $O(n^2 L)$ complexity for
both time and space. Larmore and Hirschberg [LH90] presented the Package-
Merge algorithm. This algorithm uses a greedy approach and runs in $O(nL)$ time,
with $O(n)$ space requirement. The authors reduce the original problem to the
Coin Collector's Problem, using a nodeset representation of a binary tree. Turpin
and Moffat [TM96] discuss some practical aspects of the implementation of the
Package-Merge algorithm. In [Sch95], Schieber obtains an $O(n 2^{O(\sqrt{\log L \log\log n})})$
time algorithm. Currently, this is the fastest strongly polynomial time algorithm
for constructing optimal L-restricted prefix codes. Despite the efforts of some
researchers, it remains open whether there is an $O(n \log n)$ algorithm for this problem.

In this paper we give a linear time algorithm to recognize an optimal L-
restricted prefix code. This linear time complexity holds under the assumption
that the given list of weights is already sorted. If the list of weights is not sorted,
then the algorithm requires an $O(n \log n)$ initial sorting step. This algorithm is
based on the nodeset representation of binary trees [LH90].

We assume that we are given an alphabet $\Sigma = \{a_1, \ldots, a_n\}$ with
corresponding weights $0 < w_1 \le \cdots \le w_n$, an integer $L$ with $\lceil \log n \rceil \le L < n$, and a list
of lengths $l = [l_1, \ldots, l_n]$, where $l_i$ is the length of the codeword assigned to $a_i$.
We say that $l$ is optimal iff $l_1, \ldots, l_n$ are the codeword lengths of an optimal
L-restricted prefix code for $\Sigma$. The Recognition algorithm that we introduce here
determines if $l$ is optimal or not.

The paper is organized as follows. In section 2, we present the nodeset rep- 
resentation for binary trees and we derive some useful properties. In section 3, 
we present the Recognition algorithm. In section 4, we outline a proof for the 
algorithm correctness. 

2 Trees and Nodesets 

For positive integers i and h, let us define a node as an ordered pair (i,h), where 
i is called the node index and h is the node level. A set of nodes is called a 
nodeset. 

We define the background $R(n, L)$ as the nodeset given by $R(n, L) =
\{(i, h) \mid 1 \le i \le n, 1 \le h \le L\}$.

Let $T$ be a binary tree with $n$ leaves and with corresponding leaf levels
$l_1 \ge \cdots \ge l_n$. The treeset $N(T)$ associated to $T$ is defined as the nodeset given
by $N(T) = \{(i, h) \mid 1 \le h \le l_i, 1 \le i \le n\}$.

The background in figure 1 is the nodeset $R(8, 5)$. The nodes inside the
polygon are the ones of $N(T)$, where $T$ is the tree with leaves at the following levels:
5, 5, 5, 5, 3, 3, 3, 1.

For any nodeset $A \subseteq R(n, L)$ we define the complementary nodeset $\overline{A}$ as
$\overline{A} = R(n, L) - A$. In figure 1, the nodes outside of the polygon are those of the
nodeset $\overline{N(T)}$.

Given a node $(i, h)$, define $width(i, h) = 2^{-h}$ and $weight(i, h) = w_i$. The
width and the weight of a nodeset are defined as the sums of the corresponding
widths and weights of its constituent nodes. Let $T$ be a binary tree with $n$
leaves and corresponding leaf levels $l_1 \ge \cdots \ge l_n$. It is not difficult to show
[LH90] that $width(N(T)) = n - 1$ and $weight(N(T)) = \sum_{i=1}^{n} w_i l_i$.
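
The two identities can be checked numerically on the tree of figure 1. The sketch below builds the treeset from the leaf levels and sums widths with exact rational arithmetic; the helper names and the weight list `w` are made up for the illustration (the figure itself does not specify weights).

```python
from fractions import Fraction


def treeset(levels):
    """N(T) = {(i, h) : 1 <= i <= n, 1 <= h <= l_i}, with 1-based indices."""
    return {(i, h) for i, l in enumerate(levels, start=1)
            for h in range(1, l + 1)}


def width(nodes):
    """Sum of 2^(-h) over the nodes (i, h), computed exactly."""
    return sum((Fraction(1, 2 ** h) for _, h in nodes), Fraction(0))


def weight(nodes, w):
    """Sum of w_i over the nodes (i, h); w is a 0-based Python list."""
    return sum(w[i - 1] for i, _ in nodes)


levels = [5, 5, 5, 5, 3, 3, 3, 1]          # leaf levels of the tree in figure 1
w = [1, 2, 3, 4, 5, 6, 7, 8]               # hypothetical weights
N = treeset(levels)
total_width = width(N)                      # n - 1 for a full binary tree
total_weight = weight(N, w)                 # equals sum of w_i * l_i
```

The width identity follows because leaf $i$ contributes $\sum_{h=1}^{l_i} 2^{-h} = 1 - 2^{-l_i}$, and the Kraft sum $\sum_i 2^{-l_i}$ of a full binary tree equals 1.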

In [LH90], Larmore and Hirschberg reduced the problem of finding an optimal
code tree with restricted maximal height $L$, for a given list of weights $w_1, \ldots, w_n$,
to the problem of finding the nodeset with width $n - 1$ and minimum weight
included in the background $R(n, L)$. Here, we need a slight variation of the main
theorem proved in [LH90].

Fig. 1. The background $R(8, 5)$ and the treeset $N(T)$ associated to a tree $T$ with leaf
levels 5, 5, 5, 5, 3, 3, 3, 1

Theorem 1. If a tree $T$ is an optimal code tree with restricted maximum height
$L$, then the nodeset associated to $T$ has minimum weight among all nodesets with
width $n - 1$ included in $R(n, L)$.

The proof of the previous theorem is similar to that presented in [LH90].
Therefore, the list $l = [l_1 \ge \cdots \ge l_n]$ is a list of optimal L-restricted codeword
lengths for $\Sigma$ if and only if the nodeset $N = \{(i, h) \mid 1 \le i \le n, 1 \le h \le l_i\}$ has
minimum weight among the nodesets in $R(n, L)$ that have width equal to $n - 1$.

In order to find the nodeset with width $n - 1$ and minimum weight, Larmore
and Hirschberg used the Package-Merge algorithm. This algorithm was proposed
in [LH90] to address the following problem: given a nodeset $R$ and a width $d$,
find a nodeset $X$ included in $R$ with width $d$ and minimum weight. The Package-
Merge uses a greedy approach and runs in $O(|R|)$ time, where $|R|$ denotes the number
of nodes in the nodeset $R$. The Recognition algorithm uses the Package-Merge
as an auxiliary procedure.



3 Recognition Algorithm 

In this section, we describe a linear time algorithm for recognizing optimal L- 
restricted prefix codes. The algorithm is divided into two phases. 



3.1 First Phase 

First, the algorithm scans the list $l$ to check if there is an index $i$ such that
$w_i < w_{i+1}$ and $l_i < l_{i+1}$. If that is the case, the algorithm outputs that $l$ is
not optimal and stops. In this case, the weighted path length can be reduced by
interchanging $l_i$ and $l_{i+1}$. In the negative case, we sort $l$ by non-increasing order
of lengths. This can be done in linear time since all the elements of $l$ are integers
not greater than $n$. We claim that $l$ is optimal if and only if the list obtained by
sorting $l$ is optimal. In fact, the only case where $l_i < l_{i+1}$ is when $w_i = w_{i+1}$. In
this case, we can interchange $l_i$ and $l_{i+1}$ maintaining the same external weighted
path length. Hence, we assume that $l$ is sorted with $l_1 \ge \cdots \ge l_n$.

After sorting, the algorithm verifies if $l_1 > L$. In the affirmative case, the
algorithm stops and outputs “l is not optimal”. In the negative case, $l$ respects
the length restriction. Then, the algorithm verifies if $\sum_{i=1}^{n} 2^{-l_i} = 1$. This step
can be performed in $O(n)$ time since $l$ is sorted. If $\sum_{i=1}^{n} 2^{-l_i} \ne 1$, then the algorithm
outputs “l is not optimal”. In effect, if $\sum_{i=1}^{n} 2^{-l_i} > 1$, then it follows from the
McMillan-Kraft inequality [McM56] that there is no prefix code with codeword
lengths given by $l$. On the other hand, if $\sum_{i=1}^{n} 2^{-l_i} < 1$, then $\sum_{i=1}^{n} w_i l_i$ can be
reduced by decreasing by one unit the length $l_1$ of the longest codeword. If
$\sum_{i=1}^{n} 2^{-l_i} = 1$, then the algorithm verifies two additional conditions:

1. $l_1, \ldots, l_n$ are the codeword lengths of an unrestricted optimal prefix code for $\Sigma$;

2. $l_1 = L$.

Condition 1 can be verified in $O(n)$ time by comparing $\sum_{i=1}^{n} w_i l_i$ with the
optimal weighted path length obtained by Huffman's algorithm. Condition 2 can be
checked in $O(1)$ time. We have three cases that we enumerate below:

Case 1) Condition 1 holds. The algorithm outputs “l is optimal” and stops.

Case 2) Neither condition 1 nor condition 2 holds. The algorithm outputs “l is not
optimal” and stops.

Case 3) Only condition 2 holds. The algorithm goes to phase 2.

In the second case, the external weighted path length can be reduced without
violating the height restriction by interchanging two nodes that differ by at most
one level in the tree with leaf levels given by $l$.
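
The decision logic of the first phase can be summarised in a short sketch. It assumes the weights are given in non-decreasing order, uses Python's built-in sort and exact fractions for brevity (so it does not reproduce the linear-time bookkeeping described in the text), and all function names are hypothetical.

```python
from collections import deque
from fractions import Fraction


def huffman_cost_sorted(weights):
    """Optimal unrestricted weighted path length (two-queue Huffman)."""
    if len(weights) < 2:
        return 0
    leaves, internal, cost = deque(weights), deque(), 0

    def pop_min():
        if not internal or (leaves and leaves[0] <= internal[0]):
            return leaves.popleft()
        return internal.popleft()

    while len(leaves) + len(internal) > 1:
        merged = pop_min() + pop_min()
        internal.append(merged)
        cost += merged
    return cost


def first_phase(weights, lengths, L):
    """Return 'optimal', 'not optimal', or 'phase 2'."""
    n = len(weights)
    # A strictly heavier symbol with a strictly longer codeword: swap helps.
    for i in range(n - 1):
        if weights[i] < weights[i + 1] and lengths[i] < lengths[i + 1]:
            return "not optimal"
    l = sorted(lengths, reverse=True)       # only reorders equal weights
    if l[0] > L:
        return "not optimal"                # violates the length restriction
    if sum(Fraction(1, 2 ** x) for x in l) != 1:
        return "not optimal"                # Kraft sum must be exactly 1
    cost = sum(w * x for w, x in zip(weights, l))
    if cost == huffman_cost_sorted(weights):
        return "optimal"                    # case 1
    if l[0] < L:
        return "not optimal"                # case 2
    return "phase 2"                        # case 3


verdict = first_phase([1, 1, 2, 3, 5], [3, 3, 2, 2, 2], 4)
```

In this small instance the Kraft sum is 1, but the cost is 26 against a Huffman cost of 25 while the longest codeword is shorter than $L$, so the first phase already rejects the list.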

3.2 Second Phase 

If the algorithm does not stop at the first phase, then it examines the treeset
$N$ associated to the binary tree with leaf levels $l_1 \ge \cdots \ge l_n$. In this phase,
the algorithm determines if $N$ is optimal or not. Recall that this is equivalent to
determining if $l$ is optimal or not.

First, we define three disjoint sets. We define the boundary $F$ of a treeset $N$
by

$F = \{(i, h) \in N \mid (i, h + 1) \notin N\}$.

Let us also define $F_2$ by

$F_2 = \{(i, h) \in N - F \mid h < L - \lceil \log n \rceil - 1 \text{ and } (i + 1, h) \notin N - F\}$.

Now, for $h = L - \lceil \log n \rceil - 1, \ldots, L - 1$, let $i_h$ be the largest index $i$ such that
$(i, h) \in N - F$. Then, define the nodeset $M$ as

$M = \{(i, h) \mid L - \lceil \log n \rceil - 1 \le h \le L - 1, \; i_h - 2^{h + \lceil \log n \rceil - L + 1} < i \le i_h\}$.

The way that the nodeset $M$ is defined assures that $M$ contains a nodeset
with minimum weight among the nodesets included in $(N - F)$ that have width
equal to $d$, where $d$ is a given dyadic ^ number not greater than

In figure 2, $R(14, 10)$ is the background. The polygon bounds the nodeset $N$.
The nodes of the boundary $F$ are the nodes inside the polygon that have the
letter F written inside. The nodes of the nodeset $M$ are those inside the polygon
that have the letter M, while the nodes of the nodeset $F_2$ are those inside the
polygon that have the number 2.



[Figure 2 legend. Inside the polygon: nodes in $F$, nodes in $M$, nodes in $F_2$.
Outside the polygon: nodes in $U$, nodes in $P$, nodes in $U_2$.]

Fig. 2. The nodesets used by the Recognition algorithm



Now, we define three other disjoint nodesets. The upper boundary $U$ of the
nodeset $N$ is defined by

$U = \{(i, h) \in \overline{N} \mid (i, h - 1) \in N\}$.

Let us also define $U_2$ by

$U_2 = \{(i, h) \in \overline{N} - U \mid h < L - \lceil \log n \rceil - 1 \text{ and } (i - 1, h) \in N \cup U\}$.

Now, for $h = L - \lceil \log n \rceil - 1, \ldots, L - 1$, let $i'_h$ be the smallest index $i$ such
that $(i, h)$ belongs to $\overline{N} - U$. We define the nodeset $P$ in the following way

P={{i,h)\L- [lognl - 1 < i + 4}. 

In figure 2, the nodes of the upper boundary $U$ are those outside the polygon
that have the letter U written inside. The nodes of the nodeset $P$ are those
outside the polygon that have the letter P, while the nodes of the nodeset $U_2$
are those outside the polygon that have the number 2.

^ A dyadic number is a number that can be written as a sum of integer powers of 2.

The recognition algorithm performs three steps. The pseudo-code is presented
in figure 3.

Recognition Algorithm: Phase 2

1. Evaluate the width and the weight of the nodeset $F \cup M \cup F_2$.

2. Apply the Package-Merge algorithm to obtain the nodeset $X$ with minimum weight
among the nodesets included in $F \cup M \cup F_2 \cup U \cup P \cup U_2$ that have width equal to
$width(F \cup M \cup F_2)$.

3. If $weight(X) < weight(F \cup M \cup F_2)$, then output “N is not optimal”; otherwise
output “N is optimal”.

Fig. 3. The second phase of the recognition algorithm.



As an example, let us assume that we want to decide if the list of
lengths $l = [10, 10, 10, 10, 9, 9, 7, 6, 6, 6, 4, 3, 2, 1]$ is optimal for an alphabet $\Sigma$ with
weights $[1, 1, 2, 3, 5, 8, 13, 21, 34, 61, 89, 144, 233, 377]$ and $L = 10$. The nodeset
$N$ associated to $l$ is the one presented in figure 2. The width of the nodeset
FUMUF 2 is equal to 2-|-2“^-|-2“^ and its weight is equal to 1646. Let A be the 
nodeset {[(5, 10), (6, 10), (7, 9), (7, 8), (8, 7)]} and let B be the nodeset {(10, 6)}. 
The nodeset N U A — B has width 2 -|- 2“^ -|- 2“^ and weight 1645, and as a 
consequence, at step 2 the Package-Merge finds a nodeset with weight smaller
than or equal to 1645. Therefore, the algorithm outputs that $l$ is not optimal.
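
The weight quoted in this example can be reproduced from the definitions of $F$, $F_2$ and $M$. The sketch below encodes one reading of those definitions: $F$ holds the deepest node of each column, $F_2$ the rightmost node of $N - F$ on each level $h < L - \lceil \log n \rceil - 1$, and $M$ the $2^{h + \lceil \log n \rceil - L + 1}$ rightmost nodes of $N - F$ on each remaining level $h \le L - 1$ (fewer if the level is shorter). The function name is hypothetical and the code is written for clarity, not for linear running time.

```python
def phase2_removal_sets(lengths, L):
    """Return (F, F2, M) for non-increasing leaf levels l_1 >= ... >= l_n."""
    n = len(lengths)
    log_n = (n - 1).bit_length()            # integer ceil(log2 n)
    N = {(i, h) for i, l in enumerate(lengths, start=1)
         for h in range(1, l + 1)}
    F = {(i, h) for (i, h) in N if (i, h + 1) not in N}
    inner = N - F
    cut = L - log_n - 1
    F2 = {(i, h) for (i, h) in inner
          if h < cut and (i + 1, h) not in inner}
    M = set()
    for h in range(max(cut, 1), L):
        row = [i for (i, level) in inner if level == h]
        if row:
            i_h = max(row)
            span = 2 ** (h + log_n - L + 1)
            M |= {(i, h) for i in row if i_h - span < i <= i_h}
    return F, F2, M


w = [1, 1, 2, 3, 5, 8, 13, 21, 34, 61, 89, 144, 233, 377]
l = [10, 10, 10, 10, 9, 9, 7, 6, 6, 6, 4, 3, 2, 1]
F, F2, M = phase2_removal_sets(l, 10)
combined_weight = sum(w[i - 1] for i, _ in F | F2 | M)
```

Under this reading the three sets contribute weights 992, 527 and 127, which add up to the value 1646 stated in the example.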

3.3 Algorithm Analysis 

The linear time complexity of the algorithm is established below.

Theorem 2. The Recognition algorithm runs in $O(n)$ time.

Sketch of the Proof: Phase 1 can be implemented in $O(n)$ time as we
showed in section 3.1.

Let us analyze the second phase. Step 1 can be performed in linear time as
we argue now. We define $i_s(h)$ and $i_b(h)$, respectively, as the smallest and the
biggest index $i$ such that $(i, h) \in F \cup M \cup F_2$. It is easy to verify that $i_b(h)$ is
given by the number of elements in $l$ that are greater than or equal to $h$. On
the other hand, $i_s(h)$ can be obtained through $i_b(h)$ and the definitions of $F$, $M$
and $F_2$. Hence, the set $\{(i_s(L), i_b(L)), \ldots, (i_s(1), i_b(1))\}$ can be obtained in $O(n)$
time. Since the width of $F \cup M \cup F_2$ is given by $\sum_{h=1}^{L} (i_b(h) - i_s(h) + 1) \times 2^{-h}$,
it can be evaluated in linear time.

Now, we consider step 2. We define $i'_s(h)$ and $i'_b(h)$, respectively, as the
smallest and the biggest index $i$ such that $(i, h) \in F \cup M \cup F_2 \cup U \cup P \cup U_2$. Observe that
$i_s(h) = i'_s(h)$. Furthermore, $i'_b(h)$ can be obtained through $i_b(h)$ and the
definitions of $U$, $P$ and $U_2$. Hence, the set $\{(i'_s(L), i'_b(L)), \ldots, (i'_s(1), i'_b(1))\}$ is obtained
in $O(n)$ time. Now, we show that the number of nodes in $F \cup M \cup F_2 \cup U \cup P \cup U_2$
is $O(n)$. We consider each nodeset separately. It follows from the definition of $F$
and $U$ that each of them has at most $n$ nodes. In addition, both $F_2$ and $U_2$ have at
most $L - \lceil \log n \rceil - 2$ nodes. Furthermore, from the definitions of $M$ and $P$ one can
show that $|M \cup P| < 5n$. Then it follows that $|F \cup M \cup F_2 \cup U \cup P \cup U_2| < 7n + 2L$.
Therefore, the Package-Merge runs in $O(n)$ time at step 2.

Step 3 is obviously performed in $O(1)$ time. ■

The correctness of the algorithm is a consequence of the theorem stated 
below. 

Theorem 3. If $N$ is not an optimal nodeset, then it is possible to obtain a new
nodeset with width $n - 1$ and weight smaller than that of $N$ by replacing some nodes in
$F \cup M \cup F_2$ by other nodes in $U \cup P \cup U_2$.

The sketch of the proof of theorem 3 is presented in the next section. The 
complete proof can be found in the full version of this paper. 

Now, we prove that theorem 3 implies the correctness of the Recognition
algorithm.

Theorem 4. The second phase of the Recognition algorithm is correct.

Proof: First, we assume that $N$ is not an optimal nodeset. In this case,
theorem 3 assures the existence of nodesets $A$ and $B$ that satisfy the following
conditions:

(i) $A \subseteq U \cup P \cup U_2$ and $B \subseteq F \cup M \cup F_2$;

(ii) $width(A) = width(B)$;

(iii) $weight(A) < weight(B)$.

These conditions imply that $weight(F \cup M \cup F_2 \cup A - B) < weight(F \cup M \cup F_2)$
and $width(F \cup M \cup F_2 \cup A - B) = width(F \cup M \cup F_2)$. Let $X$ be the nodeset found
by Package-Merge at step 2 in the second phase. Since $F \cup M \cup F_2 \cup A - B \subseteq
F \cup M \cup F_2 \cup U \cup P \cup U_2$, it follows that $weight(X) \le weight(F \cup M \cup F_2 \cup A - B)$.
Therefore, $weight(X) < weight(F \cup M \cup F_2)$. Hence, the algorithm outputs that
$N$ is not optimal.

Now, we assume that $N$ is optimal. In this case, $weight(X) \ge weight(F \cup
M \cup F_2)$; otherwise, $N \cup X - (F \cup M \cup F_2)$ would have weight smaller than that
of $N$, which would contradict our assumption. Hence, the algorithm outputs that
$N$ is optimal. ■



4 Correctness 

In this section, we outline the proof of theorem 3. We start by defining the
concept of a decreasing pair.

Definition 1. If a pair of nodesets $(A, B)$ satisfies the conditions (i)-(iii) listed
below, then we say that $(A, B)$ is a decreasing pair associated to $N$.

(i) $A \subseteq \overline{N}$ and $B \subseteq N$;

(ii) $width(A) = width(B)$;

(iii) $weight(A) < weight(B)$.

For the sake of simplicity, we use the term DP to denote a decreasing pair 
associated to N. We can state the following result. 

Proposition 1. The nodeset $N$ is not optimal if and only if there is a DP $(A, B)$
associated to $N$.

Proof: If $N$ is not optimal, $(N^* - N, N - N^*)$ is a DP, where $N^*$ is an
optimal nodeset. If $(A, B)$ is a DP associated to $N$, then $N \cup A - B$ has width
equal to that of $N$ and has weight smaller than that of $N$. ■
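
The pair $(A, B)$ from the example in section 3.2 gives a concrete decreasing pair. The sketch below checks the three conditions of Definition 1 directly and then applies the exchange $N \cup A - B$; the helper names are hypothetical.

```python
from fractions import Fraction


def treeset(lengths):
    return {(i, h) for i, l in enumerate(lengths, start=1)
            for h in range(1, l + 1)}


def width(nodes):
    return sum((Fraction(1, 2 ** h) for _, h in nodes), Fraction(0))


def weight(nodes, w):
    return sum(w[i - 1] for i, _ in nodes)


def is_decreasing_pair(A, B, N, n, L, w):
    """Conditions (i)-(iii) of Definition 1 for the treeset N."""
    in_background = all(1 <= i <= n and 1 <= h <= L for i, h in A)
    return (in_background and not (A & N) and B <= N   # (i)
            and width(A) == width(B)                   # (ii)
            and weight(A, w) < weight(B, w))           # (iii)


w = [1, 1, 2, 3, 5, 8, 13, 21, 34, 61, 89, 144, 233, 377]
l = [10, 10, 10, 10, 9, 9, 7, 6, 6, 6, 4, 3, 2, 1]
N = treeset(l)
A = {(5, 10), (6, 10), (7, 9), (7, 8), (8, 7)}
B = {(10, 6)}
improved = (N | A) - B
```

Here both nodesets have width $2^{-6}$, $A$ weighs 60 and $B$ weighs 61, so the exchange keeps the width at $n - 1 = 13$ and lowers the weighted path length from 2605 to 2604. The improved nodeset corresponds to the lengths [10, 10, 10, 10, 10, 10, 9, 7, 6, 5, 4, 3, 2, 1].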

Now, we define the concept of good pair (GP). 

Definition 2. A GP is a DP $(A, B)$ that satisfies the following conditions:

(i) For every DP $(A', B')$, we have that $width(A) \le width(A')$;

(ii) If $A' \subseteq \overline{N}$ and $width(A') = width(A)$, then $weight(A) \le weight(A')$;

(iii) If $B' \subseteq N$ and $width(B') = width(B)$, then $weight(B) \ge weight(B')$.

Now, we state some properties concerning good pairs. 

Proposition 2. If $N$ is not optimal, then there is a GP associated to $N$.

Proof: If $N$ is not optimal, then it follows from proposition 1 that there is at
least one DP associated to $N$. Then, let $d$ be the width of the DP with minimum
width associated to $N$. Furthermore, let $A^*$ be a nodeset with minimum weight
among the nodesets included in $\overline{N}$ that have width $d$, and let $B^*$ be a nodeset
with maximum weight among the nodesets included in $N$ that have width $d$. By
definition, $(A^*, B^*)$ is a GP. ■

Proposition 3. Let $(A^*, B^*)$ be a GP. If $A' \subset A^*$ and $B' \subset B^*$, then
$width(A') \ne width(B')$.

Proof: Let us assume that $A' \subset A^*$, $B' \subset B^*$ and $width(A') = width(B')$.
In this case, let us consider the following partitions: $A^* = A' \cup (A^* - A')$ and
$B^* = B' \cup (B^* - B')$. Since $weight(A^*) < weight(B^*)$, then either $weight(A') <
weight(B')$ or $weight(A^* - A') < weight(B^* - B')$. If $weight(A') < weight(B')$,
then $(A', B')$ is a DP and $width(A') < width(A^*)$, which contradicts the fact that
$(A^*, B^*)$ is a GP. On the other hand, if $weight(A^* - A') < weight(B^* - B')$,
then $(A^* - A', B^* - B')$ is a DP and $width(A^* - A') < width(A^*)$, which also
contradicts the fact that $(A^*, B^*)$ is a GP. Hence, $width(A') \ne width(B')$. ■

Proposition 4. If $(A^*, B^*)$ is a GP, then the following conditions hold:

(a) $width(A^*) = width(B^*) \le 2^{-1}$;

(b) $width(A^*) = 2^{-s_1}$ for some integer $s_1$ where $1 \le s_1 \le L$;

(c) Either $1 = |A^*| < |B^*|$ or $1 = |B^*| < |A^*|$.




Linear Time Recognition of Optimal L-Restricted Prefix Codes 




Proof: (a) Let us assume that $width(A^*) = width(B^*) > 2^{-1}$. In this case,
one can show, by applying lemma 1 of [LH90] at most $L$ times, that both $A^*$
and $B^*$ contain a nodeset with width $2^{-1}$. However, this contradicts proposition
3. Hence, $width(A^*) = width(B^*) \le 2^{-1}$.

(b) Now, we assume that $width(A^*) = \sum_{j=1}^{k} 2^{-s_j}$, where $1 \le s_1 < s_2 < \cdots < s_k$
and $k > 1$. In this case, one can show that $A^*$ contains a nodeset $A'$ with
width $2^{-s_1}$ and $B^*$ contains a nodeset $B'$ with width $2^{-s_1}$, which contradicts
proposition 3. Hence, $k = 1$ and $width(A^*) = 2^{-s_1}$.

(c) First, we show that either $1 = |A^*|$ or $1 = |B^*|$. Let us assume the
opposite, that is, $1 < |A^*|$ and $1 < |B^*|$. Since $width(A^*) = width(B^*) = 2^{-s_1}$
for some $1 \le s_1 \le L$, then one can show that both $A^*$ and $B^*$ contain a nodeset
with width $2^{-(s_1 + 1)}$, which contradicts proposition 3. Hence, either $1 = |A^*|$
or $1 = |B^*|$. Furthermore, we cannot have $1 = |A^*| = |B^*|$. In effect, if $1 =
|A^*| = |B^*|$, we would have $weight(A^*) \ge weight(B^*)$, and as a consequence,
$(A^*, B^*)$ would not be a GP. ■

The previous result allows us to divide our analysis into two cases. In the
first case, $A^*$ has only one node. In the second case, $B^*$ has only one node. We
define two special pairs: the removal good pairs (RGP) and the addition good
pairs (AGP).

Definition 3. If $(A^*, B^*)$ is a GP, $|A^*| = 1$ and for all GP $(A, B)$ with $|A| = 1$
we have $width(B^* - F) \le width(B - F)$, then $(A^*, B^*)$ is a removal good pair
(RGP).

Definition 4. If $(A^*, B^*)$ is a GP, $|B^*| = 1$ and for all GP $(A, B)$ with $|B| = 1$
we have $width(A^* - U) \le width(A - U)$, then $(A^*, B^*)$ is an addition good
pair (AGP).

We can state the following result. 

Proposition 5. If there is a GP associated to N, then there is either a RGP 
associated to N or an AGP associated to N. 

Proof: The proof is similar to that of proposition 2. ■ 

Theorem 5. If there is an RGP $(A, B)$ associated to $N$, then there is an RGP
$(A^*, B^*)$ associated to $N$ that satisfies the following conditions:

(a) If $(i, h) \in A^*$, then $(i - 1, h) \in N$;

(b) $|B^* - (F \cup M)| \le 1$;

(c) If $(i, h) \in B^* - (F \cup M)$, then $(i + 1, h) \notin N - F$.

Proof: We leave this proof for the full version of the paper. The proof
requires some additional lemmas and it uses arguments that are similar to those
employed in the proof of proposition 4. ■

Theorem 6. If there is an AGP $(A, B)$ associated to $N$, then there is an AGP
$(A^*, B^*)$ associated to $N$ that satisfies the following conditions:

(a) If $(i, h) \in B^*$, then $(i + 1, h) \in \overline{N}$;

(b) $|A^* - (U \cup P)| \le 1$;

(c) If $(i, h) \in A^* - (U \cup P)$, then $(i - 1, h) \in N \cup U$.

Proof: We leave this proof to the full paper. ■

Now, we prove theorem 3, which implies the correctness of the Recognition
algorithm.

Proof of theorem 3: If $N$ is not an optimal nodeset, then it follows
from proposition 2 that there is a GP associated to $N$. Hence, it follows from
proposition 5 that there is either an RGP or an AGP associated to $N$. We consider
two cases:

Case 1) There is an RGP associated to $N$.

It follows from theorem 5 that there is an RGP $(A^*, B^*)$ associated to $N$
that satisfies the conditions (a), (b) and (c) proposed in that theorem. From
the definitions of the nodesets $F$, $M$, $F_2$, $U$, $P$, $U_2$ it is easy to verify that those
conditions imply that $B^* \subseteq F \cup M \cup F_2$ and $A^* \subseteq U \cup P \cup U_2$.

Case 2) There is an AGP associated to $N$. The proof is analogous to that of
case 1.

Hence, we conclude that if $N$ is not optimal, then there is a DP $(A^*, B^*)$
such that $B^* \subseteq F \cup M \cup F_2$ and $A^* \subseteq U \cup P \cup U_2$. Therefore, it is possible to reduce
the weight of $N$ by adding some nodes that belong to $U \cup P \cup U_2$ and removing
some other nodes that belong to $F \cup M \cup F_2$. ■

Abstract. We consider the all-to-all routing problem in an optical ring
network that uses wavelength-division multiplexing (WDM). Since one-hop,
all-to-all optical routing in a WDM optical ring of $n$ nodes needs
w (C'„,7a,1) = wavelengths which can be too large even for 

moderate values of $n$, we consider in this paper $j$-hop implementations
of all-to-all routing in a WDM optical ring, $j \ge 2$. From among the
possible routings we focus our attention on uniform routings, in which
each node of the ring uses the same communication pattern. We show
that there exists a uniform 2-hop, 3-hop, and 4-hop implementation of

all-to-all routing that needs at most \j ^ + 4 , and 

+8 wavelengths, respectively. These values are within mul- 
tiplicative constants of lower bounds. 



1 Introduction 

Optical-fiber transmission systems are expected to provide a mechanism to build 
high-bandwidth, error-free communication networks, with capacities that are or- 
ders of magnitude higher than traditional networks. The high data transmission 
rate is achieved by transmitting information through optical signals, and main- 
taining the signal in optical form during switching. Wavelength-division multi- 
plexing (or WDM for short) is one of the most commonly used approaches to 
introduce concurrency into such high-capacity networks [5,6]. In this strategy, 
the optical spectrum is divided into many different channels, each channel corre- 
sponding to a different wavelength. A switched WDM network consists of nodes 
connected by point-to-point fiber-optic links. Typically, a pair of nodes that is 
connected by a fiber-optic link is connected by a pair of optic cables. Each cable 
is used in one direction and can support a fixed number of wavelengths. The 
switches in nodes are capable of redirecting incoming streams based on wave- 
lengths. We assume that switches cannot change the wavelengths, i.e. there are 
no wavelength converters. 

Thus, a WDM optical network can be represented by a symmetric digraph,
that is, a directed graph G with vertex set V(G) representing the nodes of the
network and edge set E(G) representing optical cables, such that if directed edge
[x, y] is in E(G), then directed edge [y, x] is also in E(G). In the sequel, whenever
we refer to an edge or a path, we mean a directed edge or a directed path.

* The work was supported partially by a grant from NSERC, Canada.

Different messages can use the same link (or directed edge) concurrently 
if they are assigned distinct wavelengths. However, messages assigned the same 
wavelength must be assigned edge-disjoint paths. In the graph model, each wave- 
length can be represented by a color. 

Given an optical communication network and a pattern of communications 
among the nodes, one has to design a routing i.e. a system of directed paths 
and an assignment of wavelengths to the paths in the routing so that the given 
communications can be done simultaneously. We can express the problem more 
formally as follows. 

Given a WDM optical network G, a communication request is an ordered
pair of nodes (x, y) of G such that a message is to be sent from x to y. A
communication instance I (or instance for short) is a collection of requests.

Let I be an instance in G. A j-hop solution [9,10] of I is a routing R in G
and an assignment of colors to paths in R such that

1. it is conflict-free, i.e., any two paths of R sharing the same directed edge
have different colors, and

2. for each request (x, y) in I, a directed path from x to y in R can be obtained
by concatenation of at most j paths in R.

Since the cost of an optical switch depends on the number of wavelengths it can 
handle, it is important to determine paths and a conflict-free color assignment 
so that the total number of colors is minimized. 
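The conflict-freeness condition can be checked mechanically. The following is a minimal sketch for ring networks only; the tuple encoding of a path (start node, length, direction, color) and the function names are assumptions made for illustration and do not come from the paper.

```python
from collections import defaultdict

def arcs_of_path(n, start, length, clockwise=True):
    """Directed arcs (u, v) used by a path of `length` hops from `start` on the ring C_n."""
    step = 1 if clockwise else -1
    arcs = []
    u = start
    for _ in range(length):
        v = (u + step) % n
        arcs.append((u, v))
        u = v
    return arcs

def is_conflict_free(n, colored_paths):
    """colored_paths: iterable of (start, length, clockwise, color) tuples (hypothetical encoding).

    Returns True if no directed arc carries two paths of the same color.
    """
    used = defaultdict(set)  # arc -> colors already present on that arc
    for start, length, clockwise, color in colored_paths:
        for arc in arcs_of_path(n, start, length, clockwise):
            if color in used[arc]:
                return False
            used[arc].add(color)
    return True
```

Two paths that run in opposite directions never conflict in this model, because each direction of a link is a separate directed edge.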

In a 1-hop solution of I, usually called an all-optical solution, there is a path
from x to y in R for each request (x, y) in I, and all communications are done in
optical form. In a j-hop solution, $j \ge 2$, the signal must be converted into
electronic form j - 1 times. The conversion into electronic form slows down the
transmission, but j > 1 can significantly reduce the number of wavelengths needed [10].

For an instance I in a graph G, and a j-hop routing R for it, the j-hop
wavelength index of the routing R, denoted $w(G, I, R, j)$, is defined to be the
minimum number of colors needed for a conflict-free assignment of colors to
paths in the routing R. The parameter $w(G, I, j)$, the j-hop optimal wavelength
index for the instance I in G, is the minimum value over all possible routings for
the given instance I in G. In general, the problem of determining the optimal
wavelength index is NP-complete [3].

In this paper, we are interested in ring networks. In a ring network there is a
link from each node to two other nodes. Thus, the topology of a ring network on
n nodes, $n \ge 3$, can be represented by a symmetric directed cycle $C_n$ (see [4] for
any graph terminology not defined here). A symmetric directed cycle $C_n$, $n \ge 3$,
consists of n nodes $x_0, x_1, \ldots, x_{n-1}$, with an arc from $x_i$ to $x_{(i+1) \bmod n}$
and from $x_{(i+1) \bmod n}$ to $x_i$ for $0 \le i \le n-1$; see the ring in Figure 1. We denote
by $p_{i,j}$ a shortest path from $x_i$ to $x_j$. The diameter of $C_n$, i.e., the maximum
among the lengths of the shortest paths among nodes of $C_n$, denoted by $d_n$, is
equal to $\lfloor n/2 \rfloor$.
Fig. 1. Ring Cg



The all-to-all communication instance $I_A$ is the instance that contains all
pairs of nodes of a network. $I_A$ has been studied for rings and other types of
network topologies [1], [3], [8], [11]. It has been shown for rings in [3] that
$w(C_n, I_A, 1) = \lceil \frac{1}{2} \lfloor n^2/4 \rfloor \rceil$. Optical switches that are available at present cannot
handle hundreds of different wavelengths, and thus the value of $w(C_n, I_A, 1)$
can be too large even for moderately large rings. One can reduce the number of
wavelengths by considering j-hop solutions for the $I_A$ problem for $j \ge 2$.
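To get a feel for how quickly the 1-hop requirement grows, the short sketch below evaluates the 1-hop wavelength index, assuming the formula from [3] reads ceil(floor(n^2/4)/2); this reading agrees with the value 1250 quoted for a 100-node ring in the conclusions. The function name is hypothetical.

```python
def one_hop_wavelength_index(n):
    """Evaluate ceil(floor(n^2 / 4) / 2), the assumed 1-hop all-to-all index of the ring C_n."""
    if n < 3:
        raise ValueError("a ring needs at least 3 nodes")
    quarter = (n * n) // 4      # floor(n^2 / 4)
    return -(-quarter // 2)     # ceiling division by 2 using integers only

if __name__ == "__main__":
    for n in (8, 16, 32, 64, 100):
        print(n, one_hop_wavelength_index(n))
```

The quadratic growth is what motivates trading optical-only transmission for a small number of electronic conversions.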

One possible 2-hop solution of the $I_A$ instance is the routing
$R = \{p_{0,i} : 1 \le i \le n-1\} \cup \{p_{i,0} : 1 \le i \le n-1\}$, in which there is a path from $x_0$ to all other nodes
and a path from any node to $x_0$. Any request $(x_i, x_j)$ in $I_A$ can be obtained by
a concatenation of $p_{i,0}$ and $p_{0,j}$, and we can get a conflict-free assignment of
colors to all paths in R using $\lfloor n/2 \rfloor$ colors. However, this solution has all the
drawbacks of having one server $x_0$ for the network, i.e., a failure of $x_0$ shuts
down the whole network and $x_0$ is a potential performance bottleneck. This is
very obvious if we represent R by the routing graph Gn [7], in which there is an
edge from x to y if and only if there is a path in R from x to y. In case of the
routing above, the routing graph is the star of Figure 2 a).

For better fault-tolerance and better load distribution, we should consider
a uniform routing [11] in which each node can communicate directly with the
same number of nodes as any other node and at the same distance along the
ring. More specifically, a routing R on a ring of length n is uniform if for some
integer m < n and some integers $b_1, b_2, \ldots, b_m$ the routing R consists of paths
connecting nodes that are at distance $b_j$, $1 \le j \le m$, along the ring, i.e.,
$R = \{p_{i,i+b_j}, p_{i,i-b_j} : 0 \le i \le n-1, 1 \le j \le m\}$. In a uniform routing the routing
graph is a distributed loop graph [2] of degree 2m; e.g., for m = 2, $b_1 = 1$,
and $b_2 = 3$ we get the routing graph in Figure 2 b). As seen from the figure, this
provides much better connectivity and can give a uniform load on the nodes.
Furthermore, the routing decisions can be the same in all nodes.
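The routing graph of a uniform routing is easy to build and inspect. The sketch below constructs it from a list of offsets and uses breadth-first search to find how many hops are needed in the worst case; the ring size 8 used in the example is an assumption chosen for illustration, and the function names are hypothetical.

```python
from collections import deque

def routing_graph(n, offsets):
    """Adjacency sets of the routing graph of a uniform routing on C_n.

    Node i is joined to i + b and i - b (mod n) for every offset b.
    """
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for b in offsets:
            adj[i].add((i + b) % n)
            adj[i].add((i - b) % n)
    return adj

def hops_needed(n, offsets):
    """Largest hop count from node 0 to any node, or None if some node is unreachable.

    By the symmetry of a uniform routing, node 0 is representative of every node.
    """
    adj = routing_graph(n, offsets)
    dist = {0: 0}
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values()) if len(dist) == n else None

if __name__ == "__main__":
    print(sorted(routing_graph(8, [1, 3])[0]))  # neighbors of node 0
    print(hops_needed(8, [1, 3]))
```

With offsets 1 and 3 on an 8-node ring every node has degree 4 and reaches every other node in at most two hops, whereas offset 1 alone needs four hops.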

Thus, the problem that we consider in this paper is the following:

Given a ring $C_n$ and integer j, $j \ge 2$, find a routing $R_{n,j}$ such that:

1. $R_{n,j}$ is a uniform routing on $C_n$,

2. $R_{n,j}$ is a j-hop solution of $I_A$,

3. the number of colors needed for $R_{n,j}$ is substantially less than $\lceil \frac{1}{2} \lfloor n^2/4 \rfloor \rceil$.

Fig. 2. The routing graph in Cg of a) a non-uniform, b) a uniform routing

In Section 2, we show that there exists a uniform 2-hop routing $R_{n,2}$ for $I_A$
that needs at most $\frac{n+3}{3}\sqrt{\frac{n+16}{5}} + \frac{n}{4}$ colors. We show that this is within a constant
factor of a lower bound.

Uniform j-hop solutions of $I_A$, $j \ge 3$, are considered in Section 3. We show
that there exist a uniform 3-hop routing $R_{n,3}$ and a uniform 4-hop routing $R_{n,4}$,
with the numbers of colors bounded in Theorems 3 and 4, respectively. We present
conclusions and open questions in Section 4. Most proofs in this extended abstract
are either omitted or only outlined. Complete proofs appear in [12].

2 Uniform, 2-hop All-to-All Routing 

Let n be an integer, n > 5, and $C_n$ be a ring of length n with nodes
$x_0, x_1, \ldots, x_{n-1}$. A routing R is a 2-hop solution of the all-to-all problem in $C_n$ if
any request $(x_i, x_j)$, $i \ne j$, can be obtained as a concatenation of at most two paths
in R. Since on a ring any pair of distinct nodes is at distance between 1 and
$d_n = \lfloor n/2 \rfloor$, we have that routing R is a uniform 2-hop solution of the all-to-all
instance on $C_n$ if there is a set of integers $B = \{b_1, b_2, \ldots, b_m\}$ such that R
contains all paths on the ring of length in B, and any integer between 1 and $d_n$
can be obtained using at most one addition or subtraction of two elements in B.

Lemma 1. Let k and m be positive integers, fc < and 

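$B_{k,m} = B^1_{k,m} \cup B^2_{k,m} \cup B^3_{k,m} \cup B^4_{k,m}$, where
$B^1_{k,m} = \{1, 2, \ldots, \lfloor k/2 \rfloor\}$,
$B^2_{k,m} = \{\lfloor m/2 \rfloor, \lfloor m/2 \rfloor + 1, \lfloor m/2 \rfloor + 2, \ldots, \lfloor m/2 \rfloor + k - 1\}$,
$B^3_{k,m} = \{m - tk + 1, m - (t-1)k + 1, \ldots, m - 3k + 1, m - 2k + 1\}$ with
$t = \lceil (m - \lfloor m/2 \rfloor - 2k + 2)/k \rceil$, and
$B^4_{k,m} = \{m - k + 1, m - k + 2, \ldots, m - 1, m\}$.
Then any integer in the set $\{1, 2, \ldots, 2m\}$ can be obtained using at most one
addition or subtraction of two elements in $B_{k,m}$.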
Proof. It is easy to verify that by at most one addition or subtraction we can
generate
set $\{2m - 2k + 2, \ldots, 2m\}$ from integers in $B^4_{k,m}$,
set $\{2m - (t+1)k + 2, \ldots, 2m - 2k + 1\}$ from $B^3_{k,m}$ and $B^4_{k,m}$,
set $\{m + \lfloor m/2 \rfloor - k + 1, \ldots, m + \lfloor m/2 \rfloor + k - 1\}$ from $B^2_{k,m}$ and $B^4_{k,m}$,
set $\{m + \lfloor m/2 \rfloor - tk + 1, \ldots, m + \lfloor m/2 \rfloor - k\}$ from $B^2_{k,m}$ and $B^3_{k,m}$,
set $\{2\lfloor m/2 \rfloor, \ldots, 2\lfloor m/2 \rfloor + 2k - 2\}$ from $B^2_{k,m}$,
set $\{m - k + 1, \ldots, m\}$ from $B^4_{k,m}$,
set $\{m - k - \lfloor k/2 \rfloor + 1, \ldots, m - k\}$ from $B^1_{k,m}$ and $B^4_{k,m}$,
set $\{m - tk - \lfloor k/2 \rfloor + 1, \ldots, m - 2k + \lfloor k/2 \rfloor + 1\}$ from $B^1_{k,m}$ and $B^3_{k,m}$,
set $\{\lfloor m/2 \rfloor, \ldots, \lfloor m/2 \rfloor + k + \lfloor k/2 \rfloor - 1\}$ from $B^1_{k,m}$ and $B^2_{k,m}$,
set $\{m - \lfloor m/2 \rfloor - 2k + 2, \ldots, m - \lfloor m/2 \rfloor\}$ from $B^2_{k,m}$ and $B^4_{k,m}$,
set $\{k, \ldots, m - \lfloor m/2 \rfloor - 2k + 2\}$ from $B^2_{k,m}$ and $B^3_{k,m}$, and
set $\{1, \ldots, k\}$ from $B^1_{k,m}$.
Since $m - tk + 1 \le \lfloor m/2 \rfloor + 2k - 1$, we reach the conclusion of the lemma.

□
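For small parameters the claim of Lemma 1 can be checked by brute force. The sketch below builds the four component sets as they are read from the lemma statement and tests whether every integer from 1 to 2m is an element, a sum of two elements, or a difference of two elements. Two assumptions are made here: an element may be used twice (for example 2m = m + m), and the side condition on k in the lemma, which is not enforced, is taken to hold for the pairs tried. All function names are hypothetical.

```python
def build_B(k, m):
    """Union of the four component sets of B_{k,m}, as read from Lemma 1."""
    half = m // 2
    t = -(-(m - half - 2 * k + 2) // k)             # ceil((m - floor(m/2) - 2k + 2) / k)
    b1 = set(range(1, k // 2 + 1))                  # {1, ..., floor(k/2)}
    b2 = set(range(half, half + k))                 # {floor(m/2), ..., floor(m/2) + k - 1}
    b3 = {m - i * k + 1 for i in range(2, t + 1)}   # {m - tk + 1, ..., m - 2k + 1}
    b4 = set(range(m - k + 1, m + 1))               # {m - k + 1, ..., m}
    return b1 | b2 | b3 | b4

def generated(B):
    """Positive integers obtainable from B with at most one addition or subtraction."""
    out = set(B)
    for a in B:
        for b in B:
            out.add(a + b)          # a == b is allowed (assumption)
            if a > b:
                out.add(a - b)
    return out

def covers(k, m):
    """True if every integer in {1, ..., 2m} is generated by B_{k,m}."""
    return set(range(1, 2 * m + 1)) <= generated(build_B(k, m))

if __name__ == "__main__":
    print(sorted(build_B(2, 10)), covers(2, 10))
```

For k = 2 and m = 10 the set has six elements and generates all of 1, ..., 20, which illustrates how a set of roughly square-root size can cover a linear range.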

A lower bound on the number of colors needed for a routing based on set 
Bk,m is equal to the sum of elements in Bk^m- One way to minimize the sum is 
to keep the size of the set Bk,m as small as possible. The size of the set Bk^m\s 
equal to \Bk,m\ = LfJ + ^ + _i + fc< (jn + A)/{2k). 

For an integer m, we obtain the smallest set Bk^m that generates (1,2,..., 2m} 
given in the next lemma by minimizing the value of \Bk^m\ with respect to k. 

Lemma 2. Let m be a positive integer and s{m) = ■ Then we ean gen- 

erate the set (1,2,..., 2m| by at most one operation of addition or subtraction 
on integers in the set ^st contains at most y^5(m 4) integers. 

If a node v in $C_n$ can communicate directly with all nodes at distance in
$B_{s(m),m}$ from v, where $m = \lceil n/4 \rceil$, then by Lemma 2 node v can communicate in
2 hops with every node at distance at most $d_n$ in $C_n$. This gives us a way to define a
uniform routing on $C_n$ which is a 2-hop solution of $I_A$.

Lemma 3. Let n be an integer, n > 5, and let $R_{n,2}$ be the routing in $C_n$
such that any node in $C_n$ can communicate directly with all nodes at distance in
$B_{s(\lceil n/4 \rceil),\lceil n/4 \rceil}$. Then $R_{n,2}$ is a uniform, 2-hop solution of $I_A$ on $C_n$.

We determine an upper bound on the wavelength index of Rn ,2 by a repeated 
application of the following lemma. 

Lemma 4. Let n be an integer, n > 5, and P = {pi,p 2 , ■ ■ -Pk} be a set of 
positive integers such that Tet Ip be an instance in Cn such that 

every node in Cn communicates with all nodes at distance in P, and R be a 
routing of shortest paths. 

If Pi+ P 2 ~\ +Pk divides n then w {Cn, Ip,R,l) = pi +P 2 H +Pk- 

If {Pi+P 2 ~\ \-Pk) mod n = 1 then w {Cn,Ip, R,I) <Pi+P 2 ~\ \~Pk + I- 

If {Pi + P 2 -\ \- Pk) < m then w {Cn,Ip,R, 1) <w {Cn,Im,R, 1), where Im 

is an instance in Cn such that every node in Cn communicates with a node at 
distance m. 
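The first case of Lemma 4 can be illustrated in its simplest form, a single path length p that divides n. The sketch below colors the clockwise path starting at node i with color i mod p and checks that no arc carries two paths of the same color. This tiling scheme is an illustrative assumption, not necessarily the coloring used in the paper; counterclockwise paths use the opposite arcs and can reuse the same colors.

```python
def tiling_coloring(n, p):
    """Map each start node i to color i mod p for clockwise paths of length p on C_n."""
    if n % p != 0:
        raise ValueError("this simple tiling needs p to divide n")
    return {start: start % p for start in range(n)}

def clockwise_conflict_free(n, p, coloring):
    """True if no clockwise arc carries two length-p paths with the same color.

    Arc a denotes the directed edge from node a to node (a + 1) mod n.
    """
    seen = set()
    for start, color in coloring.items():
        for h in range(p):
            arc = (start + h) % n
            if (arc, color) in seen:
                return False
            seen.add((arc, color))
    return True

if __name__ == "__main__":
    coloring = tiling_coloring(12, 3)
    print(clockwise_conflict_free(12, 3, coloring), len(set(coloring.values())))
```

Since every arc is crossed by exactly p of these paths, p colors are also necessary, which matches the equality stated in the lemma for this special case.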
Theorem 1. For any integer n > 5 there exists a uniform routing $R_{n,2}$ on $C_n$
which is a 2-hop solution of $I_A$ such that

$w(C_n, I_A, R_{n,2}, 2) \le \frac{n+3}{3}\sqrt{\frac{n+16}{5}} + \frac{n}{4}$.

Proof. Let d = [n/2j, m = \d/2], s{m) = • We define Rn ,2 to consist, 

for every node v in C„, of paths from v to nodes at distances in set Bs{m),m- As 
specified in Lemma 1, U U U where 

= {1>2,3, [s(m)/2j}, = {TO,TO+l,TO+2,...,m+s(m)-l}, 

^s{m),m = {2m-ts(TO) + l, m- (t- l)s(m) + 1, . . . , 2m-3s(m) + l, 2m-2s(m) + l} 
and t = \{m— [m/2j — 2s{m) + 2)/s(m)] and 

^s(m) m{2^ ~ s(m) + 1, 2m — s{m) + 2, . . . , 2m — 1, 2m}. Since any distance on 
the ring can be obtained as a combination of two elements in Bs(rn),mi we have 
that i ?„_2 is a 2-hop solution of I a ■ 

We now determine an upper bound on the wavelength index of Rn, 2 - 
Assume first that n is divisible by 4. In this case m divides n. Thus, by Lemma 
4 the wavelength index for all paths of length m in is equal to m. Similarly, 
for any integer i, \ <i < [s(m) /2J the wavelength index for all paths of length i 
and m — z is equal to m, and for any integer z, 0 < z < [s(m) /3J the wavelength 
index for all paths of length [m/2j -|- i, [m/2j -|- s(m) — 1 — 2z and m — s(m) -I- 
z -I- 1 is at most 2m. This deals with all paths whose length is in two 

thirds of paths whose length is in B^^^^ and 5/6 of paths whose length is 
in In total, the number of colors needed for these paths is equal to 

m(l -I- [s(m)/2j) -I- 2([s(m)/3j). 

For any z, 1 < z < s(m)/6, all paths of length m — [s(m)/2j — z need at 
most m colors, so these paths contribute at most m([s(m)/6j to the wavelength 
index. 

For 0 < z < [s(m)/3j we group together paths of length [m/2j -|- [s(m)/3j -I- 
2z-|- 1, [m/2j -l-s(m) — 2z — 1 and m— (z-|-2)s(m) -1-1. Since [m/2j -|- [s(m) /3J -I- 
2z-|-l-|- [m/2-|-s(m) — 2z — 1-l-m— (z-|-l)s(m)-|-l < 2m— (z-l- l)s(m) -I- [s(m)/3j , 
any of these path-lengths groups needs at most 2m colors and they contribute 
at most 2m([s(m)/3j to the wavelength index. 

Any remaining path-lengths in B^^^^ ^ need at most m colors and there 

are at most s(m) — [s(m)/3j of such path-lengths. Thus, w {Cn, Ia, Rn,2,2) < 
m(l -I- [s(m)/2j -I- 2([s(m)/3j) -I- [s(m)/6j -I- 2([s(m)/3j -I- s(m) — [s(mj/3j < 

m(l -I- 8s(m) /3) < ^ + ^/4. 

If n is not divisible by 4 then $m = \lceil \lfloor n/2 \rfloor / 2 \rceil$ and $4(m - 1) < n$ and
$2(2m - 1) \le n$. Thus, in this case we group the path-lengths together similarly as above,
except that in each group we put the path-lengths whose total is either m - 1
or less or 2m - 1 or less. This however may require one more color per each
group, which increases the wavelength index at most by s(m) and we get that

$w(C_n, I_A, R_{n,2}, 2) \le \frac{n+3}{3}\sqrt{\frac{n+16}{5}} + \frac{n}{4}$. □
The theorem shows that for a ring of length n, the ratio of the number of colors
needed for 1-hop all-to-all routing over the number of wavelengths needed for
a uniform 2-hop all-to-all routing is approximately $0.8\sqrt{n}$. This can be very
significant in some applications.

Any set B of integers that generates the set {1,2, . . . ,n/2} by at most one 
addition or subtraction must contain at least \pnj\ integers. In order to generate 
all integers between n/4 and n/2, set B must contain at least \pnj% integers 
between n/8 and n/4 This gives us the following lower bound on the value of 

Theorem 2. w (C„,/^,2) > 

Thus, the value of w {C'n, I a, Rn, 2 , 2) is within a constant factor of the lower 
bound. 

3 Uniform j-Hop, j ≥ 3, All-to-All Routing

We can obtain results for uniform j-hop, $j \ge 3$, all-to-all routing by repeatedly
using the results from the previous section. A routing R is a j-hop all-to-all routing
in $C_n$ if any request $(x_i, x_j)$ can be obtained as a concatenation of at most j
paths in R. Thus, R is a uniform, j-hop, all-to-all routing on $C_n$ if there is a set
of integers $B(j) = \{b_1, b_2, \ldots, b_m\}$ such that R contains all paths on the ring of
length in B(j), and any integer between 1 and $\lfloor n/2 \rfloor$ can be obtained using at
most j - 1 operations of addition or subtraction of j elements in B(j).
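The j-hop condition on the set B(j) can be tested directly for small inputs. The sketch below computes every positive integer that is a signed sum of at most j elements, with repetition allowed (an assumption, since consecutive hops may have the same length), and then checks coverage of 1, ..., floor(n/2). The function names are hypothetical.

```python
def reachable(B, j):
    """Positive integers obtainable as a signed sum of at most j elements of B."""
    steps = set(B) | {-b for b in B}
    current = {0}            # sums using exactly i elements after i rounds
    seen = {0}
    for _ in range(j):
        current = {x + s for x in current for s in steps}
        seen |= current
    return {x for x in seen if x > 0}

def is_j_hop_offset_set(n, B, j):
    """True if every distance 1, ..., floor(n/2) on C_n is reachable with at most j hops."""
    return set(range(1, n // 2 + 1)) <= reachable(B, j)

if __name__ == "__main__":
    print(sorted(reachable({1, 3}, 2)))
    print(is_j_hop_offset_set(8, {1, 3}, 2))
```

The reachable set grows with every extra hop, which is why fewer offsets, and hence fewer wavelengths, suffice as j increases.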

Any integer between 1 and 2m can be obtained using at most 1 operation of 
addition or subtraction of 2 elements in set = Bk,m m m 

from Lemma 1. As seen from the proof of Lemma 1, integers in set B^ ^ are not 
involved in additions or subtractions with integers from B^ ^ in order to obtain 
(1, 2, . . . , 2 to}. Since set B^ ^ is a, linear progression containing at most 
integers between m/2 and m, we can obtain all integers in this set, similarly as 
in Lemma 1, by at most one addition or subtraction from integers in sets 

B>lk where |D/ fcUZlf J < Thus, any integer between 

1 and 2m can be obtained by at most 2 additions or subtractions from integers 
in set Dk,m = U Bl^^ U U „ u U B^^ and the total size of 

Dk^m is at most L|J + ^ + \ \J + 2 -|- A: . By minimizing the value of the 
size with respect to k we obtain the next lemma. 

Lemma 5. Let m he a positive integer and r{m) = L'^ ■ Then we can 
generate the set (1,2, . . . ,2m} by at most 2 operations of addition or subtrac- 
tion from integers in the set Set Dr(m),m contains at most 4 

integers. 

Clearly, if every node in cycle Cn communicates directly with nodes at distance 
in |-„/ 41 ), then every node can communicate with any other node in at 

most 3-hops. 
Lemma 6. Let n be an integer, n > 5, and let $R_{n,3}$ be the routing in $C_n$
such that any node in $C_n$ can communicate directly with all nodes at distance in
$D_{r(\lceil n/4 \rceil),\lceil n/4 \rceil}$. Then $R_{n,3}$ is a uniform, 3-hop solution of $I_A$ on $C_n$.



Theorem 3. For any integer n > 5 there exists a uniform routing Rn,s on Cn 
which is a 3-hop solution of I a such that 



W (Cn,lA,Rn,3,3) < ^ ' 



[n/4] 



Proof. Let R „.3 be the 3-hop routing from Lemma 5. We calculate the wave- 
length index similarly as in Theorem 1. For path- lengths in sets 
and B‘1 ^ we use the same methods as in Theorem 1. We thus obtain that the 
wavelength index of all these paths is at most 

[n/4](l -I- r(m)/2 -|- 2r{m)/3 -\- 2r(m)/3) = rn/4](l -|- ^ ^ • 

Since all path-lengths in sets are bounded from above by 

I , the wavelength index of all these paths is at most ^ ■ Thus the 

wavelength index of Rn ,3 is at most | ^ +4 j ^ 1 ^ 

Any integer between 1 and 2m can be obtained using at most 1 operation of 
addition or subtraction of 2 elements in set Bk^m = Bl B^ Bf. B'^ ^ 
from Lemma 1. 

Since integers in any of the sets B\^, Bl B\^, and ^ form a linear 
progression, we can obtain all integers in set 1 < * < 4, similarly as 

in Lemma 1, by at most one additions or subtractions from integers in a set 

of integers that contain at most y 5( -1-4) integers. This implies that any 

integer between 1 and 2m can be obtained by at most 3 additions or subtractions 

on sets of integers that contain at most + -I- 4 )-|- y^5(| -I- 4)4- 

Y^5(| -1-4). By substituting s(m) = ^ for k, we obtain the next lemma. 

Lemma 7. Let m be a positive integer. There exists set E^i such that the set 
{1,2, .. . ,2m} can be generated from integers in Em by at most 3 operations of 
addition or subtraction. Set Em contains at most 4({/m -1-4-1- 4) integers and 
any of its elements is less than or equal to |"m/2]. 



Clearly, if every node in cycle C„ communicates directly with nodes at distance 
in A|-„/ 41 , then every node can communicate with any other node in at most 
4-hops. 

Lemma 8. Let n be an integer, n > 5, and let $R_{n,4}$ be the routing in $C_n$
such that any node in $C_n$ can communicate directly with all nodes at distance in
$E_{\lceil n/4 \rceil}$. Then $R_{n,4}$ is a uniform, 4-hop solution of $I_A$ on $C_n$.
Theorem 4. For any integer n > 5 there exists a uniform routing $R_{n,4}$ on $C_n$
which is a 4-hop solution of $I_A$ such that

w{Cn,IA,Rn,2,2)<^^^\-^+8 

Proof. Let Rn ,4 be the routing from Lemma 8. By Lemma 4, all cycles of length 
k, k < |"n/8] need at most |"n/8] -I- 1 colors. Thus, 

W (Cn,lA,Rn,2,2) < (rtl + l)(4^[fl+4 + 4) < 

Note that in this proof the bound on the wavelength index is calculated less 
precisely than those for 2-hop and 3-hop case. □ 

We can show, similarly as in the 2-hop case, that the result above is within a
constant factor of a lower bound.

Clearly, the process that we used for deriving the 3-hop and 4-hop wavelength
indices can be extended to a higher number of hops.

4 Conclusions 

We gave an upper bound on the wavelength index of a uniform j-hop all-to-all
communication instance in a ring of length n for $2 \le j \le 4$, which is within
a multiplicative constant of a lower bound. The results show that there is a
large reduction in the value of the wavelength index when going from a 1-hop
to a 2-hop routing, since we replace one factor of n by $\sqrt{n}$. However, the rate
of reduction diminishes for subsequent numbers of hops, since we only replace
one factor of $\sqrt{n}$ by $\sqrt[3]{n}$ when going from a 2-hop to a 3-hop routing, etc. For
example, for a cycle on 100 nodes we get

$w(C_{100}, I_A, 1) = 1250$, $w(C_{100}, I_A, 2) \le 165$, $w(C_{100}, I_A, 3) \le 115$.

The value of the upper bounds on the wavelength index that we obtained
depends on the size of the set $B_{s(\lceil n/4 \rceil),\lceil n/4 \rceil}$ from which we can generate the set
$\{1, 2, \ldots, \lfloor (n-1)/2 \rfloor\}$ using at most one operation of addition or subtraction.
Obviously, if we obtain an improvement on the size of $B_{s(\lceil n/4 \rceil),\lceil n/4 \rceil}$, we could
improve the upper bounds on the value of the wavelength index of a uniform
j-hop all-to-all communication instance, $j \ge 2$. This seems to be an interesting
combinatorial problem that, to the best of our knowledge, has not been studied
previously. It would be equally interesting to get a better lower bound on the size
of a set that generates the set $\{1, 2, \ldots, \lfloor (n-1)/2 \rfloor\}$ using at most one operation of
addition or subtraction. Thus we propose the following open problems:

Open Problem 1: Find a tight lower bound on the size of a set of integers $B_m$
such that any integer in the set $\{1, 2, \ldots, m\}$ can be obtained by at most one
operation of addition or subtraction from integers in $B_m$.

Open Problem 2: Find a set of integers $B_m$ such that any integer in the set
$\{1, 2, \ldots, m\}$ can be obtained by at most one operation of addition or subtraction
from integers in $B_m$ and which is smaller in size than the set given in this paper.
Similar open problems can be asked for a higher number of operations.
An answer to Open Problem 2 does not yet determine the wavelength index of a
uniform j-hop all-to-all communication instance in rings, as it is necessary to
devise a coloring of the paths in the ring. As of now, there is no general algorithm
that can give a good color assignment to paths in case of a uniform instance on
a ring. This leads us to propose the following open problem:

Open Problem 3: Give an algorithm that, given a uniform instance I in $C_n$,
finds a good approximation of $w(C_n, I, 1)$.
Abstract. We propose a fully-dynamic distributed algorithm for the
all-pairs shortest paths problem on general networks with positive real
edge weights. If $\Delta_\sigma$ is the number of pairs of nodes changing the distance
after a single edge modification $\sigma$ (insert, delete, weight-decrease,
or weight-increase), then the message complexity of the proposed algorithm
is $O(n\Delta_\sigma)$ in the worst case, where n is the number of nodes of
the network. If $\Delta_\sigma = o(n^2)$, this is better than recomputing everything
from scratch after each edge modification.



1 Introduction 

The importance of finding shortest paths in graphs is motivated by the numerous 
theoretical and practical applications known in various fields as, for instance, in 
combinatorial optimization and in communication networks (e.g., see [1,10]). We 
consider the distributed all-pairs shortest paths problem, which is crucial when 
processors in a network need to route messages with the minimum cost. 

The problem of updating shortest paths in a dynamic distributed environment
arises naturally in practical applications. For instance, the OSPF protocol,
widely used in the Internet (e.g., see [9,13]), updates the routing tables of the nodes
after a change to the network by using a distributed version of Dijkstra's
algorithm. In this and many other crucial applications the worst case complexity of
the adopted protocols is never better than recomputing the shortest paths from
scratch. Therefore, it is important to find distributed algorithms for shortest
paths that do not recompute everything from scratch after each change to the
network, because this could be very expensive in practice.

If the topology of a network is represented as a graph, where nodes represent
processors and edges represent links between processors, then the typical update
operations on a dynamic network can be modeled as insertions and deletions of
edges and update operations on the weights of edges. When arbitrary sequences
of the above operations are allowed, we refer to the fully dynamic problem; if
only insertions and weight decreases (deletions and weight increases) of edges
are allowed, then we refer to the incremental (decremental) problem.
Previous works and motivations. Many solutions have been proposed in
the literature to find and update shortest paths in the sequential case on graphs
with non-negative real edge weights (e.g., see [1,10] for a wide variety). The state
of the art is that no efficient fully dynamic solution is known for general graphs
that is faster than recomputing everything from scratch after each update, both
for single-source and all-pairs shortest paths. Actually, only output bounded fully
dynamic solutions are known on general graphs [4,11].

Some attempts have been made also in the distributed case [3,5,7,12]. In
this field the efficiency of an algorithm is evaluated in terms of message, time
and space complexity as follows. The message complexity of a distributed
algorithm is the total number of messages sent over the edges. We assume that each
message contains $O(\log n + R)$ bits, where R is the number of bits available to
represent a real edge weight, and n is the number of nodes in the network. In
practical applications messages of this kind are considered of "constant" size.
The time complexity is the total (normalized) time elapsed from a change. The
space complexity is the space usage per node.

In [5], an algorithm is given for computing all-pairs shortest paths requiring
$O(n^2)$ messages, each of size n. In [7], an efficient incremental solution has
been proposed for the distributed all-pairs shortest paths problem, requiring
an $O(n \log(nW))$ amortized number of messages over a sequence of edge insertions
and edge weight decreases. Here, W is the largest positive integer edge weight.
In [3], Awerbuch et al. propose a general technique that allows one to update the
all-pairs shortest paths in a distributed network with an $O(n)$ amortized number of
messages and $O(n)$ time, by using $O(n^2)$ space per node. In [12], Ramarao and
Venkatesan give a solution for updating all-pairs shortest paths that requires
$O(n^3)$ messages and time and $O(n)$ space. They also show that, in the worst
case, the problem of updating shortest paths is as difficult as computing shortest
paths.

The results in [12] have a remarkable consequence. They suggest that two 
possible directions can be investigated in order to devise efficient fully dynamic 
algorithms for updating all-pairs shortest paths: i) to study the trade-off between 
the message, time and space complexity for each kind of dynamic change; ii) to 
devise algorithms that are efficient in different complexity models (with respect 
to worst case and amortized analyses). 

Concerning the first direction, in [7] an efficient incremental solution has been 
provided, and the difficulty of dealing with edge deletions has been addressed. 
This difficulty arises also in the sequential case (see for example [2]). 

In this paper, the second direction is investigated. We observed that the
output complexity [4,10] was a good candidate (it is a robust measure of
performance for dynamic algorithms in the sequential case [4,10,11]). This notion
applies when the algorithms operate within a framework where explicit updates
are required on a given data structure. In such a framework, output complexity
allows one to evaluate the cost of dynamic algorithms in terms of the number of
updates to the output information of the problem that are needed after any input
update. Here we show the merits of this model also in the field of distributed
computation, improving over the results in [12].

Results of the paper. The novelty of this paper is a new efficient and prac- 
tical solution for the fully dynamic distributed all-pairs shortest paths problem. 
To the best of our knowledge, the proposed algorithm represents the first fully 
dynamic distributed algorithm whose message complexity compares favorably 
with respect to recomputing everything from scratch after each edge modifi- 
cation. This result is achieved by explicitly devising an algorithm whose main 
purpose is to minimize the cost of each output update. 

We use the following model. Given an input change $\sigma$ and a source node s, let
$\delta_{\sigma,s}$ be the set of nodes changing either the distance or the parent in the shortest
paths tree rooted at s as a consequence of $\sigma$. Furthermore, let $\delta_\sigma = \bigcup_{s \in V} \delta_{\sigma,s}$ and
$\Delta_\sigma = \sum_{s \in V} |\delta_{\sigma,s}|$. We evaluate the message and time complexity as a function
of $\Delta_\sigma$. Intuitively, this parameter represents a lower bound on the number of
messages of constant size to be sent over the network after the input change $\sigma$.
In fact, if the distance from u to v changes due to $\sigma$, then at least u and v have
to be informed about the change.
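The output parameter can be computed centrally for a small example by comparing shortest-path distances before and after a change. The sketch below is a sequential illustration only, not the distributed algorithm of the paper, and it makes a simplifying assumption: it tracks only nodes whose distance changes, not nodes whose parent in the shortest paths tree changes. The dictionary-of-dictionaries graph encoding and the function names are hypothetical.

```python
import heapq

def dijkstra(adj, source):
    """Shortest-path distances from `source`; adj[u][v] is the positive weight of edge (u, v)."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                      # stale heap entry
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def affected(adj_before, adj_after):
    """Return ({s: nodes whose distance from s changed}, total number of changed pairs)."""
    delta = {}
    for s in adj_before:
        before = dijkstra(adj_before, s)
        after = dijkstra(adj_after, s)
        delta[s] = {v for v in adj_before if before.get(v) != after.get(v)}
    return delta, sum(len(nodes) for nodes in delta.values())

if __name__ == "__main__":
    g0 = {"a": {"b": 1, "c": 5}, "b": {"a": 1, "c": 1}, "c": {"a": 5, "b": 1}}
    g1 = {"a": {"b": 1, "c": 1}, "b": {"a": 1, "c": 1}, "c": {"a": 1, "b": 1}}
    print(affected(g0, g1))
```

In the example, decreasing the weight of edge (a, c) from 5 to 1 changes only the distance between a and c, so two ordered pairs are affected while the remaining entries stay untouched.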

We design an algorithm that updates only the distances and the shortest
paths that actually change after an edge modification. In particular, if maxdeg
is the maximum degree of the nodes in the network, then we propose a fully
dynamic algorithm for the distributed all-pairs shortest paths problem requiring
in the worst case: $O(\mathit{maxdeg} \cdot \Delta_\sigma)$ messages and $O(\Delta_\sigma)$ time for insert and
weight-decrease operations; $O(\max\{|\delta_\sigma|, \mathit{maxdeg}\} \cdot \Delta_\sigma)$ messages and time for
delete and weight-increase operations. The space complexity is $O(n)$ per node.
The given bounds compare favourably with respect to the results of [12] when
$\Delta_\sigma = o(n^2)$, and they are only a factor (bounded by $\max\{|\delta_\sigma|, \mathit{maxdeg}\}$ in the worst
case) away from the optimal one, that is, the (hypothetical) algorithm that sends
over the network a number of messages equal to the number of pairs of nodes
affected by an edge modification.

2 Network Model and Notation 

We consider point-to-point communication networks, where a processor can gen- 
erate a single message at a time and send it to all its neighbors in one time step. 
Messages are delivered to their respective destinations within a finite delay, but 
they might be delivered out of order. The distributed algorithms presented in 
this paper allow communications only between neighbors. We assume an asyn- 
chronous message passing system; that is, a sender of a message does not wait 
for the receiver to be ready to receive the message. In a dynamic network, when
a modification occurs concerning an edge (u, v), we assume that only nodes u
and v are able to detect the change. Furthermore, we do not allow changes to
the network that occur while the proposed algorithm is executed.

We represent a computer network, where computers are connected by
communication links, by an undirected weighted graph G = (V, E, w), where V is a
finite set of n nodes, one for each computer; E is a finite set of m edges, one
for each link; and w is a weight function from E to positive real numbers. The
weight of the edge $(u, v) \in E$ is denoted as w(u, v). For each node u, N(u)
contains the neighbors of u. A path between two nodes u and v is a finite sequence
$p = (u = v_0, v_1, \ldots, v_k = v)$ of distinct nodes such that, for each $0 \le i < k$,
$(v_i, v_{i+1}) \in E$, and the weight of the path is $weight(p) = \sum_{0 \le i < k} w(v_i, v_{i+1})$.
The distance d(u, v) between any pair of nodes u and v is the minimum weight
of all possible paths connecting u to v. A shortest path from u to v is defined as
any path p such that weight(p) = d(u, v). If $s \in V$ is an arbitrary source node,
we denote as $T_s$ a shortest paths tree of G rooted at s; for any $u \in V$, $T_s(u)$
denotes the subtree of $T_s$ rooted at u. We assume that each node u knows: i)
the identities of all nodes, 1, 2, ..., n; ii) the identity of each node in N(u); iii)
for each $u_i \in N(u)$, the edge connecting u to $u_i$, and the weight $w(u, u_i)$.



3 The Fully-Dynamic Algorithm 

We describe the algorithms handling weight-decrease and weight-increase
operations; the extension to insert and delete operations, respectively, is
straightforward.

We use the following data structures. A routing table $RT[\cdot,\cdot]$ is needed to store
the information on the all-pairs shortest paths. Each node $u$ in $G$ maintains
only the set of records $RT[u,\cdot]$, one record $RT[u,v]$ for each possible destination
$v \in V \setminus \{u\}$. Each record has two fields: $RT[u,v].weight$ and $RT[u,v].via$, where
$weight$ is the distance between $u$ and $v$, and $via$ is the neighbor of $u$ in the
path used to determine the weight. In the following, each subcomponent
$RT[u,v].field$ of the routing table will also be denoted as $field(u,v)$. The space required
to store the routing table is clearly $O(n)$ per node.

For each $v \in V$, $d'(s,v)$ denotes the distance from $s$ to $v$ in the graph $G'$
obtained from $G$ after an edge modification (in general, we denote by $\gamma'$ any
parameter $\gamma$ after an edge modification).

After an edge modification, for each source $s$, the proposed procedures correctly
update $weight(v,s)$ as $d'(v,s)$, and $via(v,s)$ as the neighbor of $v$ in the
path used to determine $weight(v,s)$ in $G'$. Notice that the procedures implicitly
maintain a shortest paths tree $T_s$ for each source $s$; $T_s$ is the tree induced by the
set of edges $(u, via(u,s))$, for each node $u$ reachable from $s$.

Both for weight-decrease and for weight-increase operations, we describe the 
behavior of the algorithm with respect to a fixed source s. To obtain the algo- 
rithm for updating all-pairs shortest paths, it is sufficient to apply the algorithm 
with respect to all the possible sources. 

3.1 Decreasing the Weight of an Edge 

Suppose that a weight-decrease operation $\sigma$ is performed on edge $(x,y)$, that is,
$w'(x,y) = w(x,y) - \varepsilon$, $\varepsilon > 0$. In this case, if $d(s,x) = d(s,y)$, then $\delta_{\sigma,s} = \emptyset$, and
no recomputation is needed. Otherwise, without loss of generality, we assume
that $d(s,x) < d(s,y)$. In this case, if $d'(s,y) < d(s,y)$ then all nodes that belong
to $T_s(y)$ are in $\delta_{\sigma,s}$. On the other hand, there could exist nodes not contained in
$T_s(y)$ that belong to $\delta_{\sigma,s}$. In any case, every node in $\delta_{\sigma,s}$ decreases its distance
from $s$ as a consequence of $\sigma$.

The algorithm shown in Fig. 1 is based on the following property: if $v \in \delta_{\sigma,s}$,
then there exists a shortest path connecting $v$ to $s$ in $G'$ that contains the path
from $v$ to $y$ in $T_y$ as subpath. This implies that $d'(v,s)$ can be computed as
$d'(v,s) = d(v,y) + w'(x,y) + d(x,s)$.



Node v receives "weight(u, s)" from u.

1. if via(v, y) = u then
2. begin
3.   if weight(v, s) > w(v, u) + weight(u, s) then
4.   begin
5.     weight(v, s) := w(v, u) + weight(u, s)
6.     via(v, s) := u
7.     for each v_i ∈ N(v) \ {u} do send "weight(v, s)" to v_i
8.   end
9. end



Fig. 1. The decreasing algorithm of node v. 



Based on this property, the algorithm performs a visit of $T_y$ starting from
$y$. This visit finds all the nodes in $\delta_{\sigma,s}$ and updates their routing tables. Each
of the visited nodes $v$ performs the algorithm of Fig. 1. When $v$ figures out
that it belongs to $\delta_{\sigma,s}$ (line 3), it sends "weight(v, s)" to all its neighbors. This
is required because $v$ does not know its children in $T_y$ (since $y$ is arbitrary,
maintaining this information would require $O(n^2)$ space per node). Only when a
node that has received the message "weight(u, s)" from a neighbor $u$ performs
line 1 does it figure out whether it is a child of $u$ in $T_y$.

Notice that the algorithm of Fig. 1 is performed by every node $v$ distinct
from $y$. The algorithm for $y$ is slightly different: (i) $y$ starts the algorithm when
it receives the message "weight(u, s)" from $u = x$. This message is sent to $y$ as
soon as $x$ detects the decrease on edge $(x,y)$; (ii) $y$ does not perform the test of
line 1; (iii) the weight $w(v,u)$ at lines 3 and 5 coincides with $w'(x,y)$.
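
The message flow of Fig. 1 can be tried out with a small sequential simulation. The following Python sketch is only an illustration under stated assumptions: the function names `build_tables` and `decrease`, the dictionary layout of the routing tables, and the FIFO queue that stands in for the asynchronous network are all hypothetical and not part of the paper. It updates the tables for one fixed source `s` only and assumes that $d(s,x) < d(s,y)$.

```python
import heapq
from collections import deque

def build_tables(adj):
    """Initial tables: weight[u][t] = d(u, t), via[u][t] = next hop of u towards t."""
    weight = {u: {} for u in adj}
    via = {u: {} for u in adj}
    for t in adj:
        dist, hop, heap = {t: 0}, {t: None}, [(0, t)]
        while heap:                      # plain Dijkstra rooted at t
            d, u = heapq.heappop(heap)
            if d > dist[u]:
                continue
            for v, w in adj[u].items():
                if d + w < dist.get(v, float("inf")):
                    dist[v], hop[v] = d + w, u
                    heapq.heappush(heap, (d + w, v))
        for u in dist:
            weight[u][t], via[u][t] = dist[u], hop[u]
    return weight, via

def decrease(adj, weight, via, x, y, new_w, s):
    """Simulate Fig. 1 for source s; returns the number of delivered messages."""
    adj[x][y] = adj[y][x] = new_w
    pending = deque([(x, y, weight[x][s])])    # (sender u, receiver v, weight(u, s))
    messages = 0
    while pending:
        u, v, w_us = pending.popleft()
        messages += 1
        # line 1: only children of u in T_y go on; y skips the test for x's message
        if (u, v) != (x, y) and via[v][y] != u:
            continue
        if weight[v][s] > adj[v][u] + w_us:    # line 3
            weight[v][s] = adj[v][u] + w_us    # line 5
            via[v][s] = u                      # line 6
            pending.extend((v, vi, weight[v][s]) for vi in adj[v] if vi != u)  # line 7
    return messages

# Toy network: edge (1, 2) drops from weight 5 to weight 1, source s = 0.
adj = {0: {1: 1, 3: 10}, 1: {0: 1, 2: 5}, 2: {1: 5, 3: 1}, 3: {2: 1, 0: 10}}
weight, via = build_tables(adj)
sent = decrease(adj, weight, via, x=1, y=2, new_w=1, s=0)
```

In this toy run only nodes 2 and 3 change their distance from the source; node 0 receives one message but discards it at line 1 because it is not a child of node 3 in $T_y$.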

Theorem 1. Updating all-pairs shortest paths over a distributed network with
$n$ nodes and positive real edge weights, after a weight-decrease or an insert
operation, requires $O(\mathit{maxdeg} \cdot \Delta_\sigma)$ messages, $O(\Delta_\sigma)$ time, and $O(n)$ space.



3.2 Increasing the Weight of an Edge 

Suppose that a weight-increase operation $\sigma$ is performed on edge $(x,y)$, that is,
$w'(x,y) = w(x,y) + \varepsilon$, $\varepsilon > 0$. In order to distinguish the set of required updates
determined by the operation, we borrow from [4] the idea of coloring the nodes
with respect to $s$, as follows:
• $color(q,s) = white$ if $q$ changes neither the distance from $s$ nor the parent in
$T_s$ (i.e., $weight'(q,s) = weight(q,s)$ and $via'(q,s) = via(q,s)$);

• $color(q,s) = pink$ if $q$ preserves its distance from $s$, but it must replace the
old parent in $T_s$ (i.e., $weight'(q,s) = weight(q,s)$ and $via'(q,s) \ne via(q,s)$);

• $color(q,s) = red$ if $q$ increases the distance from $s$ (i.e., $weight'(q,s) >
weight(q,s)$).

According to this coloring, the nodes in $\delta_{\sigma,s}$ are exactly the red and pink
nodes. Without loss of generality, let us assume that $d(s,x) < d(s,y)$. In this
case it is easy to see that if $v \notin T_s(y)$, then $v \notin \delta_{\sigma,s}$. In other words, all the red
and pink nodes belong to $T_s(y)$.

Initially all nodes are white. If $v$ is pink or red, then either $v$ is a child of a red
node in $T_s(y)$, or $v = y$. If $v$ is red then the children of $v$ in $T_s(y)$ will be either
pink or red. If $v$ is pink or white then the other nodes in $T_s(v)$ are white.

By the above discussion, if we want to bound the number of messages delivered
over the network to update the shortest paths from $s$ as a function of the
number of output updates, then we cannot search the whole $T_s(y)$. In fact, if
$T_s(y)$ contains a pink node $v$, then the other nodes in $T_s(v)$ remain white and do not
require any update. For each red or pink node $v$, we use the following notation:

• $AP_s(v)$ denotes the set of alternative parents of $v$ with respect to $s$, that is, a
neighbor $q$ of $v$ belongs to $AP_s(v)$ when $d(s,q) + w(q,v) = d(s,v)$.

• $BNR_s(v)$ denotes the best non-red neighbor of $v$ with respect to $s$, that is, a
non-red neighbor $q$ of $v$ such that the quantity $d(s,q) + w(q,v)$ is minimum.

If $AP_s(v)$ is empty and $BNR_s(v)$ exists, then $BNR_s(v)$ represents the best way
for $v$ to reach $s$ in $G'$ by means of a path that does not contain red nodes.

The algorithm that we propose for handling weight-increase operations consists
of three phases, namely the Coloring, the Boundarization, and the Recomputing
phase. In the following we describe these three phases in detail. We just state
in advance that the coloring phase does not perform any update to $RT[\cdot,s]$. A
pink node $v$ updates $via(v,s)$ during the boundarization phase, whereas a red
node $v$ updates both $weight(v,s)$ and $via(v,s)$ during the recomputing phase.

Coloring phase. During this phase each node in $T_s(y)$ decides its color. At the
beginning all these nodes are white. The pink and red nodes are found starting
from $y$ and performing a pruned search of $T_s(y)$. The coloring phase of a generic
node $v$ is given in Fig. 2. Before describing the algorithm in detail, we remark
that it works under the following assumptions.

A1. If a node $v$ receives a request for $weight(v,s)$ and $color(v,s)$ from a neighbor
(line 7), then it answers immediately.

A2. If a red node $v$ receives the message "color(z, s) = red" from $z \in N(v)$, then
it immediately sends "end-coloring" to $z$ (see line 1 for red nodes).

When $v$ receives "color(z, s) = red" from $z$, it understands that it has to decide its
color. The behavior of $v$ depends on its current color. Three cases may arise:
The red node v receives the message "color(z, s) = red" from z ∈ N(v).

1. send to z the message "end-coloring"; HALT

The non-red node v receives the message "color(z, s) = red" from z ∈ N(v).

1. if color(v, s) = white then
2. begin
3.   if z ≠ via(v, s) then send to z the message "end-coloring"; HALT
4.   AP_s(v) := ∅
5.   for each v_i ∈ N(v) \ {z} do
6.   begin
7.     ask v_i for weight(v_i, s) and color(v_i, s)
8.     if color(v_i, s) ≠ red and weight(v, s) = w(v, v_i) + weight(v_i, s)
9.     then AP_s(v) := AP_s(v) ∪ {v_i}
10.  end
11. end
12. if z ∈ AP_s(v) then AP_s(v) := AP_s(v) \ {z}
13. if AP_s(v) ≠ ∅
14. then color(v, s) := pink
15. else begin
16.   color(v, s) := red
17.   for each v_i ∈ N(v) \ {z} send to v_i the message "color(v, s) = red"
18.   for each v_i ∈ N(v) \ {z} wait from v_i the message "end-coloring"
19. end
20. send to z the message "end-coloring"

Fig. 2. The coloring phase of node v 



1. $v$ is white: In this case, $v$ tests whether $z$ is its parent in $T_s(y)$ or not. If
$z \ne via(v,s)$ (line 3), then the color of $v$ remains white and $v$ communicates
to $z$ the end of its coloring. If $z = via(v,s)$, then $v$ finds all the alternative
parents with respect to $s$, and records them into $AP_s(v)$ (lines 4-10). If
$AP_s(v) \ne \emptyset$ (line 13), then $v$ sets its color to pink (line 14) and communicates
to $z$ the end of its coloring (line 20). If $AP_s(v) = \emptyset$ (line 15), then $v$ does
the following: i) sets its color to red (line 16); ii) propagates the message
"color(v, s) = red" to each neighbor but $z$ (line 17); iii) waits for the message
"end-coloring" from each of these neighbors (line 18); iv) communicates the
end of its coloring phase to $z$ (line 20).

2. $v$ is pink: In this case, the test at line 12 is the first action performed
by $v$. If $z$ is an alternative parent of $v$, then $z$ is removed from $AP_s(v)$
(since $z$ is now red). After this removal, $v$ performs the test at line 13: if
there are still elements in $AP_s(v)$, then $v$ remains pink and sends to $z$ the
message concerning the end of its coloring phase (lines 14 and 20); otherwise,
$v$ becomes red and propagates the coloring phase to its neighbors (lines 15-19),
as already described in case 1 above.

3. $v$ is red: In this case, $v$ performs a different procedure: it simply communicates
to $z$ the end of its coloring phase (see line 1 for red nodes). This is
done to guarantee that Assumption A2 holds.

According to this strategy, at the end of the coloring phase node $y$ is aware
that each node in $T_s(y)$ has been correctly colored. The algorithm of Fig. 2 is
performed by every node distinct from $y$. The algorithm for $y$ is slightly different.
In particular, at line 20, $y$ does not send "end-coloring" to $z = x$. Instead, $y$ starts
the boundarization phase described below and shown in Fig. 3.
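
The three cases of the coloring phase can be explored with a sequential sketch in Python. This is an illustration under assumptions, not the distributed procedure itself: the function name `color_nodes` and the dictionary-based tables are hypothetical, and the recursive call replaces "send the message, then wait for end-coloring" by handling one neighbor at a time, which corresponds to one possible asynchronous execution.

```python
def color_nodes(adj, weight, via, x, y, s):
    """Simulate Fig. 2 for source s after the weight of edge (x, y) increased.

    adj[v][q] is w(v, q); weight[v][s] and via[v][s] are the routing-table
    fields before the update. Returns (color, ap) as dictionaries.
    """
    color = {v: "white" for v in adj}
    ap = {}

    def receive(v, z):
        # v receives "color(z, s) = red" from z
        if color[v] == "red":
            return                        # red node: answer "end-coloring" at once
        if color[v] == "white":
            if z != via[v].get(s):
                return                    # line 3: z is not the parent of v
            ap[v] = {q for q in adj[v]    # lines 4-10
                     if q != z and color[q] != "red"
                     and weight[v][s] == adj[v][q] + weight[q][s]}
        ap[v].discard(z)                  # line 12
        if ap[v]:
            color[v] = "pink"             # line 14
        else:
            color[v] = "red"              # line 16
            for q in adj[v]:              # lines 17-18, one neighbor at a time
                if q != z:
                    receive(q, v)

    receive(y, x)                         # x notifies y of the increase
    return color, ap

# Toy network with source s = 0; the weight of edge (0, 1) increases.
adj = {0: {1: 1, 2: 2}, 1: {0: 1, 2: 1, 3: 1}, 2: {0: 2, 1: 1}, 3: {1: 1}}
weight = {0: {0: 0}, 1: {0: 1}, 2: {0: 2}, 3: {0: 2}}
via = {0: {}, 1: {0: 0}, 2: {0: 1}, 3: {0: 1}}
color, ap = color_nodes(adj, weight, via, x=0, y=1, s=0)
```

In the toy run node 1 and node 3 turn red, while node 2 turns pink because its direct edge to the source is an alternative parent.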



Node v receives the message "start boundarization(ε)" from z = via(v, s).

1. if color(v, s) = pink then
2. begin
3.   via(v, s) := q, where q is an arbitrary node in AP_s(v)
4.   color(v, s) := white; HALT
5. end
6. if color(v, s) = red then
7. begin
8.   ℓ_v := weight(v, s) + ε
9.   BNR_s(v) := nil
10.  PINK-CHILDREN_s(v) := ∅; RED-CHILDREN_s(v) := ∅
11.  for each v_i ∈ N(v) \ {z} do
12.  begin
13.    v asks v_i for weight(v_i, s), via(v_i, s), and color(v_i, s)
14.    if color(v_i, s) ≠ red and ℓ_v > w(v, v_i) + weight(v_i, s) then
15.    begin
16.      ℓ_v := w(v, v_i) + weight(v_i, s)
17.      BNR_s(v) := v_i
18.    end
19.    if color(v_i, s) = pink and via(v_i, y) = v
20.    then PINK-CHILDREN_s(v) := PINK-CHILDREN_s(v) ∪ {v_i}
21.    if color(v_i, s) = red and via(v_i, y) = v
22.    then RED-CHILDREN_s(v) := RED-CHILDREN_s(v) ∪ {v_i}
23.  end
24.  if BNR_s(v) = nil
25.  then B_s(v) := ∅ {v is not boundary for s}
26.  else B_s(v) := {(v; ℓ_v)} {v is boundary for s}
27.  for each v_i ∈ PINK-CHILDREN_s(v) ∪ RED-CHILDREN_s(v)
28.  do send "start boundarization(ε)" to v_i
29.  for each v_i ∈ RED-CHILDREN_s(v) do
30.  begin
31.    wait the message "B_s(v_i)" from v_i
32.    B_s(v) := B_s(v) ∪ B_s(v_i)
33.  end
34.  send "B_s(v)" to via(v, s)
35. end



Fig. 3. The boundarization phase of node v 
Boundarization phase. During this phase, for each red and pink node $v$,
a path (not necessarily the shortest one) from $v$ to $s$ is found. Note that, as
in Assumption A1 of Coloring, if a node $v$ receives a request for $weight(v,s)$,
$via(v,s)$ and $color(v,s)$ from a neighbor (line 13), then it answers immediately.

When a pink node $v$ receives the message "start boundarization(ε)" ($\varepsilon$ is
the increment of $w(x,y)$) from $via(v,s)$, it understands that the coloring phase
has terminated; at this point $v$ needs only to choose arbitrarily $via(v,s)$ among
the nodes in $AP_s(v)$, and to set its color to white (lines 2-5).

When a red node $v$ receives the message "start boundarization(ε)" from
$via(v,s)$, it has no alternative parent with respect to $s$. At this point, $v$ computes
the shortest between the old path from $v$ to $s$ (whose weight is now increased
by $\varepsilon$ (line 8)), and the path from $v$ to $s$ via $BNR_s(v)$ (if any). If $BNR_s(v)$
exists, then $v$ can reach $s$ through a path containing no red nodes. In order to
find $BNR_s(v)$, $v$ asks every neighbor $v_i$ for $weight(v_i,s)$, $via(v_i,s)$ and $color(v_i,s)$
(see lines 12-23). At the same time, using $color(v_i,s)$ and $via(v_i,s)$, $v$ finds its
pink and red children in $T_s(y)$ and records them into PINK-CHILDREN$_s(v)$ and
RED-CHILDREN$_s(v)$ (see lines 20 and 22).

If $BNR_s(v)$ exists and the path from $v$ to $s$ via $BNR_s(v)$ is shorter than
$weight(v,s) + \varepsilon$, then $v$ is called boundary for $s$. In this case, $v$ initializes $B_s(v)$
as $\{(v; \ell_v)\}$ (line 26), where $\ell_v$ is the weight of the path from $v$ to $s$ via $BNR_s(v)$.

When $v$ terminates the boundarization phase, the set $B_s(v)$ contains all the
pairs $(z; \ell_z)$ such that $z \in T_s(v)$ is a boundary node. In fact, at line 28 $v$
sends to each node $v_i \in$ PINK-CHILDREN$_s(v)$ $\cup$ RED-CHILDREN$_s(v)$ the value
$\varepsilon$ (to propagate the boundarization), and then waits to receive $B_s(v_i)$ from
each $v_i \in$ RED-CHILDREN$_s(v)$ (lines 30-33). Notice that $v$ does not wait for any
message from a pink child $v_i \in$ PINK-CHILDREN$_s(v)$, because $B_s(v_i)$ is empty.
Whenever $v$ receives $B_s(v_i)$ from a child $v_i$, it updates $B_s(v)$ as $B_s(v) \cup B_s(v_i)$
(line 32). Finally, at line 34, $v$ sends $B_s(v)$ towards $y$ through $via(v,s)$.

At the end of the boundarization phase, the set $B_s(y)$, containing all the
boundary nodes for $s$, has been computed and stored in $y$. Notice that the
algorithm of Fig. 3 is performed by every node distinct from $y$. The algorithm
for $y$ is slightly different. In particular, at line 34, $y$ does not send "B_s(y)" to
$via(y,s)$. Instead, $y$ uses this information to start the recomputing phase. In the
recomputing phase, $y$ broadcasts through $T_s(y)$ the set $B_s(y)$ to each red node.

Recomputing phase. In this phase, each red node $v$ computes $weight'(v,s)$
and $via'(v,s)$. The recomputing phase of a red node $v$ is shown in Fig. 4, and
described in what follows. Let us suppose that the red node $v$ has received the
message "B_s(y)" from $via(v,y)$.

Concerning the shortest path from $v$ to $s$ in $G'$ two cases may arise: a) it
coincides with the shortest path from $v$ to $s$ in $G$; b) it passes through a boundary
node. In case b) two subcases are possible: b1) the shortest path from $v$ to $s$ in
$G'$ passes through $BNR_s(v)$; b2) the shortest path from $v$ to $s$ in $G'$ contains a
boundary node different from $v$.

Node $v$ performs the following local computation: over all $(b; \ell_b) \in B_s(y)$, it
computes $W_{min}$ as $\min_b\{weight(v,b) + \ell_b\}$ (line 1), and $b_{min}$ as the boundary
node such that $W_{min} = weight(v,b_{min}) + \ell_{b_{min}}$ (line 2). After $v$ has updated
$weight(v,s)$ as $weight(v,s) + \varepsilon$ (line 3), $v$ compares $weight(v,s)$ (the new weight of
the old path from $v$ to $s$) with $W_{min}$ (line 4) and correctly computes $weight'(v,s)$
(line 6), according to cases a) and b) above. At lines 7-9, $via(v,s)$ is computed
according to cases b1) and b2) above. Finally, by using the information contained
in RED-CHILDREN$_s(v)$, $v$ propagates $B_s(y)$ to the red nodes in $T_s(v)$.



The node v receives "B_s(y)" from via(v, y).

1. W_min := min{weight(v, b) + ℓ_b | (b; ℓ_b) ∈ B_s(y)}
2. let b_min be a node such that W_min = weight(v, b_min) + ℓ_{b_min}
3. weight(v, s) := weight(v, s) + ε
4. if weight(v, s) > W_min then
5. begin
6.   weight(v, s) := W_min
7.   if b_min = v
8.   then via(v, s) := BNR_s(v)
9.   else via(v, s) := via(v, b_min)
10. end
11. color(v, s) := white
12. for each v_i ∈ RED-CHILDREN_s(v) do send "B_s(y)" to v_i

Fig. 4. The recomputing phase of node v 
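
The local computation of a red node in the recomputing phase is small enough to write down directly. The Python sketch below is a hedged illustration: the function name `recompute`, the dictionaries `weight`, `via` and `bnr`, and the representation of $B_s(y)$ as a mapping from boundary node to its value $\ell_b$ are assumptions made for the example. Message passing (line 12) and the color reset (line 11) are left out, and an empty boundary set is treated as $W_{min} = \infty$, which is also an assumption.

```python
def recompute(v, s, eps, weight, via, boundary, bnr):
    """Lines 1-10 of the recomputing phase for a red node v and source s.

    boundary maps each boundary node b to l_b; bnr[v] is the best non-red
    neighbor of v found during boundarization (or None).
    """
    # lines 1-2: best detour through a boundary node
    w_min, b_min = min(((weight[v][b] + l_b, b) for b, l_b in boundary.items()),
                       default=(float("inf"), None))
    weight[v][s] += eps                        # line 3: old path, now longer
    if weight[v][s] > w_min:                   # line 4
        weight[v][s] = w_min                   # line 6
        # lines 7-9: leave through BNR_s(v) if v itself is the boundary node
        via[v][s] = bnr[v] if b_min == v else via[v][b_min]
    return weight[v][s], via[v][s]

# Hypothetical tables of a red node "v": old distance 10 to s via "p",
# distance 2 to boundary node "b" via "q".
weight = {"v": {"s": 10, "b": 2, "v": 0}}
via = {"v": {"s": "p", "b": "q"}}
result = recompute("v", "s", 4, weight, via, {"b": 5}, {"v": None})
```

Here the detour through boundary node "b" costs 2 + 5 = 7, which beats the old path of weight 10 + 4 = 14, so the node switches to the neighbor it already uses to reach "b".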



It is easy to show that each phase is deadlock free. 

Theorem 2. Updating all-pairs shortest paths over a distributed network with $n$
nodes and positive real edge weights, after a weight-increase or a delete operation,
requires $O(\max\{|\delta_\sigma|, \mathit{maxdeg}\} \cdot \Delta_\sigma)$ messages, $O(\max\{|\delta_\sigma|, \mathit{maxdeg}\} \cdot \Delta_\sigma)$ time,
and $O(n)$ space.
Abstract. Integer factorization and discrete logarithms have been known
for a long time as fundamental problems of computational number theory.
The invention of public key cryptography in the 1970s then led to
a dramatic increase in their perceived importance. Currently the only 
widely used and trusted public key cryptosystems rely for their pre- 
sumed security on the difficulty of these two problems. This makes the 
complexity of these problems of interest to the wide public, and not just 
to specialists. 

This lecture will present a survey of the state of the art in integer factor- 
ization and discrete logarithms. Special attention will be devoted to the 
rate of progress in both hardware and algorithms. Over the last quarter 
century, these two factors have contributed about equally to the progress 
that has been made, and each has stimulated the other. Some projections 
for the future will also be made. 

Most of the material covered in the lecture is available in the survey 
papers [1,2] and the references listed there. 
Abstract. Let $p$ be a prime and let $g$ be a primitive root of the field $\mathbb{F}_p$
of $p$ elements. In the paper we show that the communication complexity
of the last bit of the Diffie-Hellman key $g^{xy}$ is at least $n/24 + o(n)$,
where $x$ and $y$ are $n$-bit integers and $n$ is defined by the inequalities
$2^n \le p \le 2^{n+1} - 1$. We also obtain a nontrivial upper bound on the
Fourier coefficients of the last bit of $g^{xy}$. The results are based on some
new bounds of exponential sums with $g^{xy}$.



1 Introduction 

Let $p$ be a prime and let $\mathbb{F}_p$ be a finite field of $p$ elements which we identify with
the set $\{0, \ldots, p-1\}$. We define the integer $n$ by the inequalities $2^n \le p \le 2^{n+1} - 1$
and denote by $B_n$ the set of $n$-bit integers,

$$B_n = \{x \in \mathbb{Z} : 0 \le x \le 2^n - 1\}.$$

Throughout the paper we do not distinguish between $n$-bit integers $x \in B_n$
and their binary expansions. Thus $B_n$ can be considered as the $n$-dimensional
Boolean cube $B_n = \{0, 1\}^n$ as well.

Finally, we recall the notion of communication complexity. Given a Boolean
function $f(x,y)$ of $2n$ variables

$$x = (x_1, \ldots, x_n) \in B_n \quad\text{and}\quad y = (y_1, \ldots, y_n) \in B_n,$$

we assume that there are two collaborating parties and the value of $x$ is known
to one party and the value of $y$ is known to the other; however, each party has no
information about the values of the other. The goal is to create a communication
protocol $P$ such that, for any inputs $x, y \in B_n$, at the end at least one party
can compute the value of $f(x,y)$. The largest number of bits to exchange by a
protocol $P$, taken over all possible inputs $x, y \in B_n$, is called the communication
complexity $C(P)$ of this protocol. The smallest possible value of $C(P)$, taken
over all possible protocols, is called the communication complexity $C(f)$ of the
function $f$, see [2,21].
Given two integers $x, y \in B_n$, the corresponding Diffie-Hellman key is defined
as $g^{xy}$. Studying various complexity characteristics of this function is of
primary interest for cryptography and complexity theory. Several lower bounds
on the various complexity characteristics of this function as well as the discrete
logarithm have been obtained in [30]. In particular, for a primitive root $g$ of $\mathbb{F}_p$,
one can consider the Boolean function $f(x,y)$ which is defined as the rightmost
bit of $g^{xy}$, that is,

$$f(x_1, \ldots, x_n, y_1, \ldots, y_n) = \begin{cases} 1, & \text{if } g^{xy} \in \{1, 3, \ldots, p-2\};\\ 0, & \text{if } g^{xy} \in \{2, 4, \ldots, p-1\}. \end{cases} \qquad (1)$$

In many cases the complexity lower bounds of [30] are as strong as the best
known lower bounds for any other function. However, the lower bound $C(f) \ge
\log_2 n + O(1)$ of Theorem 9.4 of [30] is quite weak. Here, using a different method,
we derive the linear lower bound $C(f) \ge n/24 + o(n)$ on the communication
complexity of $f$.



We also use the same method to obtain an upper bound on the Fourier
coefficients of this function, that is,

$$\hat{f}(u,v) = 2^{-2n} \sum_{x \in B_n} \sum_{y \in B_n} (-1)^{f(x,y) + \langle ux \rangle + \langle vy \rangle},$$

where $u, v \in B_n$ and $\langle wz \rangle$ denotes the dot product of the vectors $w, z \in B_n$. This
bound can be combined with many known relations between Fourier coefficients
and various complexity characteristics such as the circuit complexity, the
average sensitivity, the formula size, the average decision tree depth, the degrees
of exact and approximate polynomial representations over the reals and several
others, see [3,4,8,15,22,23,27] and references therein.
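
For very small primes the function $f$ and its Fourier coefficients can be tabulated by brute force, which helps to fix the definitions. The Python sketch below is only an illustration: the helper names `dh_last_bit`, `dot` and `fourier_coefficient` and the toy parameters $p = 11$, $g = 2$ are our own choices, and the dot product is taken modulo 2, which is how we read the definition above.

```python
def dh_last_bit(g, p, x, y):
    """f(x, y): rightmost bit of g^(xy) mod p."""
    return pow(g, x * y, p) & 1

def dot(w, z):
    """Dot product modulo 2 of the bit vectors of the integers w and z."""
    return bin(w & z).count("1") & 1

def fourier_coefficient(g, p, n, u, v):
    """2^(-2n) times the sum of (-1)^(f(x,y) + <ux> + <vy>) over all x, y in B_n."""
    total = sum((-1) ** (dh_last_bit(g, p, x, y) ^ dot(u, x) ^ dot(v, y))
                for x in range(2 ** n) for y in range(2 ** n))
    return total / 4 ** n

# Toy parameters: 2^3 <= 11 <= 2^4 - 1, and 2 is a primitive root modulo 11.
p, g, n = 11, 2, 3
coeffs = [fourier_coefficient(g, p, n, u, v)
          for u in range(2 ** n) for v in range(2 ** n)]
largest = max(abs(c) for c in coeffs)
```

Since $(-1)^{f}$ takes only the values $\pm 1$, the squares of all $2^{2n}$ coefficients sum to 1 (Parseval's identity), which gives a quick sanity check on such a table.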

We remark, that although these results do not seem to have any crypto- 
graphic implications it is still interesting to study complexity characteristics of 
such an attractive number theoretic function. Various complexity lower bounds 
for Boolean functions associated with other natural number theoretic problems 
can be found in [1,5,6,7,14,30]. 

Our main tool is exponential sums, including a new upper bound on double
exponential sums

$$S_a(\mathcal{X}, \mathcal{Y}) = \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} e\left(a g^{xy}\right),$$

where $e(z) = \exp(2\pi i z/p)$, with $a \in \mathbb{F}_p$ and arbitrary sets $\mathcal{X}, \mathcal{Y} \subseteq B_n$. These
sums are of independent number theoretic interest. In particular they can be
considered as generalizations of the well known sums

H 

V) = E E Qo.{H) = Y^i.o-f) ^ 

U v^V x—1 

where a G Fp, 1 < H < t, and U, V Q Fp, which are well known in the literature 
and have proved to be useful for many applications, see [12,17,28,29] as well as 
Problem 14.a to Chapter 6 of [31] for Ta{U, V) and [18,19,20,24,25] for Qa{H)- 
In this paper we estimate sums $S_a(\mathcal{X}, \mathcal{Y})$ for arbitrary sets $\mathcal{X}$ and $\mathcal{Y}$.
Provided that both sets are of the same cardinality $|\mathcal{X}| = |\mathcal{Y}| = N$ our estimates
are nontrivial for $N \ge p^{15/16+\delta}$ with any fixed $\delta > 0$.

We remark that the distribution of the triples $(g^x, g^y, g^{xy})$ for $x, y \in B_n$
has been studied in [9,10], see also [30]. In fact this paper relies on an estimate
of some double exponential sums from [9].

Throughout the paper the implied constants in symbols '$O$', '$\ll$' and '$\gg$' are
absolute (we recall that $A \ll B$ and $B \gg A$ are equivalent to $A = O(B)$).



2 Preparations 



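We say that a set $\mathcal{S} \subseteq B_n$ is a cylinder if there is a set $\mathcal{J} \subseteq \{1, \ldots, n\}$ such
that the membership $x \in \mathcal{S}$ does not depend on the components $x_j$, $j \in \mathcal{J}$, of
$x = (x_1, \ldots, x_n) \in B_n$. The discrepancy $\Delta(f)$ of $f$ is defined as

$$\Delta(f) = 2^{-2n} \max_{\mathcal{S}, \mathcal{T}} \left| N_1(\mathcal{S}, \mathcal{T}) - N_0(\mathcal{S}, \mathcal{T}) \right|,$$

where the maximum is taken over all cylinders $\mathcal{S}, \mathcal{T} \subseteq B_n$ and $N_\mu(\mathcal{S}, \mathcal{T})$ is the
number of pairs $(x,y) \in \mathcal{S} \times \mathcal{T}$ with $f(x,y) = \mu$.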

The link between the discrepancy and communication complexity is provided 
by the following statement which is a partial case of Lemma 2.2 from [2]. 

Lemma 1. The bound

$$C(f) \ge \log_2 \left( \frac{1}{\Delta(f)} \right)$$

holds.



We use exponential sums to estimate the discrepancy of the function (1). 
The following statement has been proved in [9], see the proof of Theorem 8 of 
that paper. 



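Lemma 2. Let $\lambda \in \mathbb{F}_p$ be of multiplicative order $t$. For any $a, b \in \mathbb{F}_p^*$, the bound

$$\sum_{u=1}^{t} \left| \sum_{v=1}^{t} e\left(a\lambda^{v} + b\lambda^{uv}\right) \right|^4 \ll p\, t^{11/3}$$

holds.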



We recall the well known fact, see Theorem 5.2 of Chapter 1 of [26], that
for any integer $m \ge 2$ the number of integer divisors $\tau(m)$ of $m$ satisfies the
bound

$$\log_2 \tau(m) \le (1 + o(1)) \frac{\ln m}{\ln \ln m}. \qquad (2)$$
We now apply Lemma 2 to estimate $S_a(\mathcal{X}, \mathcal{Y})$ for arbitrary sets $\mathcal{X}, \mathcal{Y} \subseteq B_n$.

Lemma 3. The bound

$$\max_{\gcd(a,p)=1} |S_a(\mathcal{X}, \mathcal{Y})| \ll |\mathcal{X}|^{1/2} |\mathcal{Y}|^{5/6} p^{5/8} \tau(p-1)$$

holds.

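Proof. For a divisor $d \mid p-1$ we denote by $\mathcal{Y}(d)$ the subset of $y \in \mathcal{Y}$ with
$\gcd(y, p-1) = d$. Then

$$|S_a(\mathcal{X}, \mathcal{Y})| \le \sum_{d \mid p-1} |\sigma_d|,$$

where

$$\sigma_d = \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}(d)} e\left(a g^{xy}\right).$$

Using the Cauchy inequality, we derive

$$|\sigma_d|^2 \le |\mathcal{X}| \sum_{x \in \mathcal{X}} \left| \sum_{y \in \mathcal{Y}(d)} e\left(a g^{xy}\right) \right|^2 \le |\mathcal{X}| \sum_{x=1}^{p-1} \left| \sum_{y \in \mathcal{Y}(d)} e\left(a g^{xy}\right) \right|^2 = |\mathcal{X}| \sum_{y,z \in \mathcal{Y}(d)} \sum_{x=1}^{p-1} e\left(a\left(g^{xy} - g^{xz}\right)\right).$$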

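By the Hölder inequality we have

$$|\sigma_d|^8 \le |\mathcal{X}|^4 |\mathcal{Y}(d)|^6 \sum_{y,z \in \mathcal{Y}(d)} \left| \sum_{x=1}^{p-1} e\left(a\left(g^{xy} - g^{xz}\right)\right) \right|^4 \le |\mathcal{X}|^4 |\mathcal{Y}(d)|^6 \sum_{y \in \mathcal{Y}(d)} \sum_{u=1}^{(p-1)/d} \left| \sum_{x=1}^{p-1} e\left(a\left(g^{xy} - g^{xyu}\right)\right) \right|^4.$$

Because each element $y \in \mathcal{Y}(d)$ can be represented in the form $y = dv$ with
$\gcd(v, (p-1)/d) = 1$ and $\lambda_d = g^d$ is of multiplicative order $(p-1)/d$, we see
that the double sum over $u$ and $x$ does not depend on $y$. Therefore,

$$|\sigma_d|^8 \le |\mathcal{X}|^4 |\mathcal{Y}(d)|^7 \sum_{u=1}^{(p-1)/d} \left| \sum_{x=1}^{p-1} e\left(a\left(\lambda_d^{x} - \lambda_d^{xu}\right)\right) \right|^4 = |\mathcal{X}|^4 |\mathcal{Y}(d)|^7 d^4 \sum_{u=1}^{(p-1)/d} \left| \sum_{x=1}^{(p-1)/d} e\left(a\left(\lambda_d^{x} - \lambda_d^{xu}\right)\right) \right|^4.$$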
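By Lemma 2 we obtain

$$|\sigma_d|^8 \ll |\mathcal{X}|^4 |\mathcal{Y}(d)|^7 d^4 p \left( \frac{p-1}{d} \right)^{11/3} \le |\mathcal{X}|^4 |\mathcal{Y}(d)|^7 p^{14/3} d^{1/3}. \qquad (3)$$

Using the bound $|\mathcal{Y}(d)| \le |\mathcal{Y}|$ for $d < p/|\mathcal{Y}|$ and the bound $|\mathcal{Y}(d)| \le p/d$ for
$d \ge p/|\mathcal{Y}|$, we see that

$$|\sigma_d| \ll |\mathcal{X}|^{1/2} |\mathcal{Y}|^{5/6} p^{5/8}$$

for any divisor $d \mid p-1$ and the desired result follows. □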
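
The last step of the proof can be spelled out as follows. This expansion is ours, not the paper's, and it assumes that the bound (3) reads $|\sigma_d|^8 \ll |\mathcal{X}|^4 |\mathcal{Y}(d)|^7 p^{14/3} d^{1/3}$, which is how we read the damaged display.

```latex
\begin{align*}
d < p/|\mathcal{Y}|:\quad
  |\sigma_d|^8 &\ll |\mathcal{X}|^4 |\mathcal{Y}|^7 p^{14/3} d^{1/3}
   < |\mathcal{X}|^4 |\mathcal{Y}|^7 p^{14/3} \left(\frac{p}{|\mathcal{Y}|}\right)^{1/3}
   = |\mathcal{X}|^4 |\mathcal{Y}|^{20/3} p^{5}, \\
d \ge p/|\mathcal{Y}|:\quad
  |\sigma_d|^8 &\ll |\mathcal{X}|^4 \left(\frac{p}{d}\right)^{7} p^{14/3} d^{1/3}
   = |\mathcal{X}|^4 p^{35/3} d^{-20/3}
   \le |\mathcal{X}|^4 p^{35/3} \left(\frac{p}{|\mathcal{Y}|}\right)^{-20/3}
   = |\mathcal{X}|^4 |\mathcal{Y}|^{20/3} p^{5}.
\end{align*}
```

Taking eighth roots gives $|\sigma_d| \ll |\mathcal{X}|^{1/2} |\mathcal{Y}|^{5/6} p^{5/8}$ in both cases, and summing over the $\tau(p-1)$ divisors $d$ of $p-1$ yields the bound of Lemma 3.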

Finally, to apply Lemma 3 to the discrepancy we need the following two
well known statements which are Problems 11.a and 11.c to Chapter 3 of [31],
respectively.



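Lemma 4. For any integers $u$ and $m \ge 2$,

$$\sum_{\lambda=0}^{m-1} e_m(\lambda u) = \begin{cases} 0, & \text{if } u \not\equiv 0 \pmod m;\\ m, & \text{if } u \equiv 0 \pmod m. \end{cases}$$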



Lemma 5. For any integers $H$ and $m \ge 2$,

$$\sum_{a=1}^{m-1} \left| \sum_{z=0}^{H} e_m(az) \right| = O(m \ln m).$$



3 Communication Complexity and Fourier
Coefficients of the Diffie-Hellman Key



Now we can prove our main results. 

Theorem 1. For the communication complexity of the function $f(x,y)$ given
by (1), the bound

$$C(f) \ge \frac{1}{24} n + o(n)$$

holds.

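Proof. We can assume that $p \ge 3$. Put $H = (p-1)/2$. Then for any sets
$\mathcal{S}, \mathcal{T} \subseteq B_n$ (not necessarily cylinders), Lemma 4 implies that

$$N_0(\mathcal{S}, \mathcal{T}) = \frac{1}{p} \sum_{a=0}^{p-1} \sum_{x \in \mathcal{S}} \sum_{y \in \mathcal{T}} \sum_{z=1}^{H} e\left(a\left(g^{xy} - 2z\right)\right).$$

Separating the term $|\mathcal{S}||\mathcal{T}|H/p$, corresponding to $a = 0$, we obtain

$$\left| N_0(\mathcal{S}, \mathcal{T}) - \frac{|\mathcal{S}||\mathcal{T}|H}{p} \right| \le \frac{1}{p} \sum_{a=1}^{p-1} |S_a(\mathcal{S}, \mathcal{T})| \left| \sum_{z=1}^{H} e(-2az) \right|.$$

Using Lemma 3 and then Lemma 5, we derive

$$\left| N_0(\mathcal{S}, \mathcal{T}) - \frac{|\mathcal{S}||\mathcal{T}|H}{p} \right| \ll p^{23/24} \tau(p-1) \sum_{a=1}^{p-1} \left| \sum_{z=1}^{H} e(-2az) \right| = p^{23/24} \tau(p-1) \sum_{a=1}^{p-1} \left| \sum_{z=1}^{H} e(az) \right| \ll p^{47/24} \tau(p-1) \ln p.$$

Because $N_0(\mathcal{S}, \mathcal{T}) + N_1(\mathcal{S}, \mathcal{T}) = |\mathcal{S}||\mathcal{T}|$ and $H/p = 1/2 + O(p^{-1})$ we see that

$$\left| N_1(\mathcal{S}, \mathcal{T}) - \frac{|\mathcal{S}||\mathcal{T}|H}{p} \right| \ll p^{47/24} \tau(p-1) \ln p$$

as well. Therefore the discrepancy of $f$ satisfies the bound

$$\Delta(f) \ll 2^{-2n} p^{47/24} \tau(p-1) \ln p \ll p^{-1/24} \tau(p-1) \ln p.$$

Using (2), we derive that $\Delta(f) \ll 2^{-n/24 + o(n)}$. Applying Lemma 1, we obtain
the desired result. □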

The same considerations also imply the following estimate on the Fourier 
coefficients. 

Theorem 2. For the Fourier coefficients of the function $f(x,y)$ given by (1),
the bound

$$\max_{u,v \in B_n} \left| \hat{f}(u,v) \right| \ll 2^{-n/24 + o(n)}$$

holds.



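Proof. We fix some nonzero vectors $u, v \in B_n$ and denote by $\mathcal{X}_0$ and $\mathcal{Y}_0$ the sets
of integers $x \in B_n$ and $y \in B_n$ for which $\langle ux \rangle = 0$ and $\langle vy \rangle = 0$, respectively.
Similarly, we define the sets $\mathcal{X}_1$ and $\mathcal{Y}_1$ by the conditions $\langle ux \rangle = 1$ and $\langle vy \rangle = 1$,
respectively. Then we obtain

$$\hat{f}(u,v) = 2^{-2n} \sum_{x \in \mathcal{X}_0} \sum_{y \in \mathcal{Y}_0} (-1)^{f(x,y)} + 2^{-2n} \sum_{x \in \mathcal{X}_1} \sum_{y \in \mathcal{Y}_1} (-1)^{f(x,y)} - 2^{-2n} \sum_{x \in \mathcal{X}_1} \sum_{y \in \mathcal{Y}_0} (-1)^{f(x,y)} - 2^{-2n} \sum_{x \in \mathcal{X}_0} \sum_{y \in \mathcal{Y}_1} (-1)^{f(x,y)}.$$

It is easy to see that

$$\sum_{x \in \mathcal{X}_\eta} \sum_{y \in \mathcal{Y}_\mu} (-1)^{f(x,y)} = 2N_0(\mathcal{X}_\eta, \mathcal{Y}_\mu) - |\mathcal{X}_\eta||\mathcal{Y}_\mu|, \qquad \eta, \mu = 0, 1,$$

where, as before, $N_0(\mathcal{X}_\eta, \mathcal{Y}_\mu)$ is the number of pairs $(x,y) \in \mathcal{X}_\eta \times \mathcal{Y}_\mu$ with
$f(x,y) = 0$. Using the same arguments as in the proof of Theorem 1, we derive
that

$$\left| N_0(\mathcal{X}_\eta, \mathcal{Y}_\mu) - \frac{|\mathcal{X}_\eta||\mathcal{Y}_\mu|}{2} \right| \ll p^{47/24} \tau(p-1) \ln p, \qquad \eta, \mu = 0, 1,$$

and from (2) we derive the desired result for non-zero vectors $u$ and $v$.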

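Now, if $u = 0$ is a zero vector and $v$ is not, then defining $\mathcal{Y}_0$ and $\mathcal{Y}_1$ as before
we obtain

$$\hat{f}(0,v) = 2^{-2n} \sum_{x \in B_n} \sum_{y \in \mathcal{Y}_0} (-1)^{f(x,y)} - 2^{-2n} \sum_{x \in B_n} \sum_{y \in \mathcal{Y}_1} (-1)^{f(x,y)}.$$

As before we derive

$$\left| N_0(B_n, \mathcal{Y}_\mu) - 2^{n-1}|\mathcal{Y}_\mu| \right| \ll p^{47/24} \tau(p-1) \ln p, \qquad \mu = 0, 1,$$

which implies the desired estimate in this case. The same arguments apply if
$v = 0$ is a zero vector and $u$ is not.

Finally, if both $u = v = 0$ are zero vectors then

$$\hat{f}(0,0) = 2^{-2n} \sum_{x \in B_n} \sum_{y \in B_n} (-1)^{f(x,y)} = 2^{-2n} \left( 2N_0(B_n, B_n) - 2^{2n} \right)$$

and using the bound

$$\left| N_\mu(B_n, B_n) - 2^{2n-1} \right| \ll p^{47/24} \tau(p-1) \ln p, \qquad \mu = 0, 1,$$

we conclude the proof. □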



4 Remarks 

Our bound on the discrepancy of $f$, obtained in the proof of Theorem 1, combined
with Lemma 2.2 of [2], can be used to derive a linear lower bound on the
$\varepsilon$-distributional communication complexity, which is defined in a similar way,
except that the communicating parties are allowed to make mistakes on at most
$\varepsilon 2^{2n}$ inputs $x, y \in B_n$.
Similar considerations can be used to estimate more general sums

$$S_a(\mathcal{X}_1, \ldots, \mathcal{X}_k) = \sum_{x_1 \in \mathcal{X}_1} \cdots \sum_{x_k \in \mathcal{X}_k} e\left(a g^{x_1 \cdots x_k}\right)$$

with $k \ge 2$ sets $\mathcal{X}_1, \ldots, \mathcal{X}_k \subseteq B_n$ and thus study the multi-party communication
complexity of the function (1).

It is obvious that the bound of Lemma 3 can be improved if a nontrivial
upper bound on $|\mathcal{Y}(d)|$ is known and substituted in (3). Certainly one cannot
hope to obtain such bounds for arbitrary sets $\mathcal{Y}$ but for cylinders such bounds
can be proved. Unfortunately this does not yield any improvement of Theorems 1
and 2. Indeed, nontrivial bounds on $|\mathcal{Y}(d)|$ improve the statement of Lemma 3
for sets of small cardinality; however, in our applications sets of cardinality of
order $p$ turn out to be most important. But for such sets the trivial bound
$|\mathcal{Y}(d)| \le p/d$, which has been used in the proof of Lemma 3, is the best possible.

The bound of Lemma 3 also implies the same results for modular exponentiation
$u^x \pmod p$, $u, x \in B_n$. It would be interesting to extend this result to
modular exponentiation modulo arbitrary integers $m$. In some cases, for example,
when $m$ contains a large prime divisor, this can be done within the framework
of this paper. Other moduli may require some new ideas.

Finally, similar results hold in a more general situation when g is an element 
of multiplicative order t > with any fixed (5 > 0, rather than a primitive 

root. 

On the other hand, our method does not seem to work for Boolean functions
representing middle bits of $g^{xy}$, and obtaining such results is an interesting open
question.
Abstract. The Quintic Reciprocity Law is used to produce an algorithm
that runs in polynomial time and that determines the primality
of numbers $M$ such that $M^4 - 1$ is divisible by a power of 5 which is
larger than $\sqrt{M}$, provided that a small prime $p$, $p \equiv 1 \pmod 5$, is given
such that $M$ is not a fifth power modulo $p$. The same test equations are
used for all such $M$.

If M is a fifth power modulo p, a sufficient condition that determines 
the primality of M is given. 



1 Introduction 

Deterministic primality tests that run in polynomial time, for numbers of the
form $M = A5^n - 1$, have been given by Williams [9]. Moreover, Williams and
Judd [11] also considered primality tests for numbers $M$ such that $M^2 \pm 1$ have
large prime factors. A more general deterministic primality test was developed by
Adleman, Pomerance and Rumely [1], improved by H. Cohen and H.W. Lenstra
[4], and implemented by H. Cohen and A.K. Lenstra [5]. Although this is more
general, for specific families of numbers one may find more efficient algorithms.

This is what we give in this paper, for numbers $M$ such that $M^4 - 1$ is 
divisible by a large power of 5. More specifically, let $M = A5^n \pm \omega_n$, where 

$$0 < A < 5^n; \qquad 0 < \omega_n < 5^n/2; \qquad \omega_n^4 \equiv 1 \pmod{5^n}.$$ 

In the given range there are exactly two possible values for $\omega_n$. One is 
$\omega_n = 1$ and the other is computed inductively via Hensel's lemma. Thus, given 
$\omega_n$ satisfying $\omega_n^2 \equiv -1 \pmod{5^n}$, there is a unique 
$x \pmod 5$ such that $(\omega_n + x5^n)^2 \equiv -1 \pmod{5^{n+1}}$. 

Once $x \pmod 5$ is found, select $\omega_{n+1} = \omega_n + x5^n$ or 
$\omega_{n+1} = 5^{n+1} - (\omega_n + x5^n)$, according to which one satisfies 
$\omega_{n+1} < 5^{n+1}/2$. 
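
The inductive step above can be sketched in a few lines of Python. This is only an illustration, not the authors' Maple code: the function names `lift_sqrt_minus_one` and `omega` are hypothetical, and the lifting digit $x$ is obtained by solving $c + 2\omega_n x \equiv 0 \pmod 5$ with $c = (\omega_n^2 + 1)/5^n$, which is what expanding $(\omega_n + x5^n)^2 + 1$ modulo $5^{n+1}$ gives.

```python
def lift_sqrt_minus_one(w1, n_max):
    """Return [w_1, ..., w_{n_max}] with w_n**2 == -1 (mod 5**n) and w_n == w1 (mod 5)."""
    if (w1 * w1 + 1) % 5 != 0:
        raise ValueError("w1 must be a square root of -1 modulo 5 (that is, 2 or 3)")
    roots = [w1]
    w, q = w1, 5                      # invariant: w*w + 1 == 0 (mod q), with q = 5**n
    for _ in range(n_max - 1):
        c = (w * w + 1) // q          # exact division by 5**n
        # (w + x*q)**2 + 1 == q*(c + 2*w*x) (mod 5*q), so solve c + 2*w*x == 0 (mod 5)
        x = (-c * pow(2 * w, -1, 5)) % 5
        w += x * q
        q *= 5
        roots.append(w)
    return roots


def omega(n):
    """Nontrivial root of x**4 == 1 (mod 5**n) lying strictly between 1 and 5**n / 2."""
    w = lift_sqrt_minus_one(2, n)[-1]
    return min(w, 5 ** n - w)
```

For example, `lift_sqrt_minus_one(2, 4)` gives `[2, 7, 57, 182]`, and `omega(5)` picks the smaller of the two square roots of $-1$ modulo $5^5$. The modular inverse via `pow(..., -1, 5)` assumes Python 3.8 or later.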

For such integers $M$ we use the Quintic Reciprocity Law to produce an 
algorithm, which runs in polynomial time, that determines the primality of $M$ 
provided that a small prime $p$, $p \equiv 1 \pmod 5$, is given, such that $M$ is not a fifth 
power modulo $p$. 

We next describe the theorem that leads naturally to the algorithm. 

Let $\zeta = e^{2\pi i/5}$ be a fifth complex primitive root of unity. 
Let $D = \mathbb{Z}[\zeta]$ be the corresponding cyclotomic ring. Let $\pi$ be a primary 
irreducible element of $D$ lying over $p$. Let $K = \mathbb{Q}(\zeta + \zeta^{-1}) = \mathbb{Q}(\sqrt{5})$. Let 
$G = \mathrm{Gal}(\mathbb{Q}(\zeta)/\mathbb{Q})$ be the Galois group of the cyclotomic field $\mathbb{Q}(\zeta)$ over $\mathbb{Q}$. For 
every integer $c$ denote by $\sigma_c$ the element of $G$ that sends $\zeta$ to $\zeta^c$. For $\tau$ in $\mathbb{Z}[G]$ 
and $\alpha$ in $D$ we denote by $\alpha^{\tau}$ the action of the element $\tau$ of $\mathbb{Z}[G]$ on the element 
$\alpha$ of $D$. 

Let $f$ be the order of $M$ modulo 5 ($f$ is also the order of $M$ modulo $5^n$). Denote 
by $\Phi_f(x)$ the $f$-th cyclotomic polynomial. We note that $\Phi_f(M) \equiv 0 \pmod{5^n}$. 

For $f = 1$ and $f = 2$ let $\gamma = \pi^{1 - 3\sigma_3}$. For $f = 4$ let $\gamma = \pi$. For all cases let 

$$\alpha = (\gamma/\bar{\gamma})^{\Phi_f(M)/5^n},$$ 

where the bar indicates complex conjugation. 

Let $T_0 = \mathrm{Trace}_{K/\mathbb{Q}}(\alpha + \bar{\alpha})$ 
and $N_0 = \mathrm{Norm}_{K/\mathbb{Q}}(\alpha + \bar{\alpha})$. 

For $k \ge 0$ define $T_{k+1}$, $N_{k+1}$ recursively by the formulas: 

$$T_{k+1} = T_k^5 - 5N_kT_k^3 + 5N_k^2T_k + 15N_kT_k - 5T_k^3 + 5T_k \tag{1.1}$$ 

$$N_{k+1} = N_k^5 - 5N_k^3(T_k^2 - 2N_k) + 5N_k[(T_k^2 - 2N_k)^2 - 2N_k^2] + 25N_k^3 - 25N_k(T_k^2 - 2N_k) + 25N_k \tag{1.2}$$ 

Let $\prod_j P_j(x)$ be the factorization modulo $M$ of the polynomial $\Phi_5(x)$ as a product 
of irreducible polynomials. Let $\mathfrak{p}_j = (M, P_j(\zeta))$ be the ideal of $D$ generated 
by $M$ and $P_j(\zeta)$. 

We prove the following Theorem: 

Theorem 1. Let $M$, $A$, $\omega_n$ be as before and suppose that $M$ is not divisible by 
any of the solutions of $x^4 \equiv 1 \pmod{5^n}$; $1 < x < 5^n$. The following statements 
are equivalent: 

i) $M$ is prime. 

ii) For each $\mathfrak{p}_k$ there is an integer $i_k \not\equiv 0 \pmod 5$ such that 

$$\alpha^{5^{n-1}} \equiv \zeta^{i_k} \pmod{\mathfrak{p}_k} \tag{1.3}$$ 

iii) 

$$T_{n-1} \equiv N_{n-1} \equiv -1 \pmod{M} \tag{1.4}$$ 

We note that the equivalence of (i) and (ii) is an extension of Proth's Theorem, 
and the equivalence of (i) and (iii) extends the Lucas-Lehmer Test. 

We use the Quintic Reciprocity Law to extend Proth's Theorem, in the same 
way Guthman [6] and Berrizbeitia-Berry [3] used the Cubic Reciprocity Law to 
extend Proth's Theorem for numbers of the form $A3^n \pm 1$. From this extension 
of Proth's Theorem we derive a Lucas-Lehmer type test, by taking Traces and 
Norms of certain elements in the field $\mathbb{Q}(\sqrt{5})$, in a way which is analogous to 
Rosen's proof [8] of the Lucas-Lehmer test. Generalization of this scheme to a 
wider family of numbers is the object of a forthcoming paper. 

In Section 2 of this paper we introduce the quintic symbol, and state the facts 
we need from the arithmetic of the ring $D$, including the Quintic Reciprocity 
Law. In Section 3 we prove Theorem 1. Section 4 is devoted to remarks that 
are of interest in their own right, and are useful for implementation. We include a 
theorem that gives a sufficient condition that determines the primality of $M$ 
when the assumption on $p$ is removed. In Section 5 we describe an implementation of 
the algorithm derived from Theorem 1. Work similar to this had been done earlier 
by Williams [10]. Williams derived his algorithm from properties of some Lucas 
sequences. Our algorithm is derived from a generalization of Proth's Theorem, 
and gives a unified treatment to test primality of numbers $M$ such that $M^4 - 1$ is 
divisible by a large enough power of 5. In particular, an interesting observation 
is that the algorithm we use to test numbers $M$ of the form $A5^n + 1$ is the same 
as the one we use to test numbers of the form $A5^n - 1$, which was not the case 
for the earlier algorithms we found in the literature. 

2 The Ring D. Quintic Symbol and Quintic Reciprocity 

What we state in this section may be found, among other places, in [7], chapters 
12 to 14, from where we borrow the notation and presentation. 

Let $D = \mathbb{Z}[\zeta]$ be the ring of integers of the cyclotomic field $\mathbb{Q}(\zeta)$. Let $p$ be a 
rational prime, $p \ne 5$. Let $f$ be the order of $p$ modulo 5. Then $p$ factors as the 
product of $4/f$ prime ideals in $D$. If $\mathcal{P}$ and $\mathcal{P}'$ are two of these prime ideals, 
there is a $\sigma$ in $G = \mathrm{Gal}(\mathbb{Q}(\zeta)/\mathbb{Q})$ such that $\sigma(\mathcal{P}) = \mathcal{P}'$. $D/\mathcal{P}$ is a finite field 
with $p^f$ elements and is called the residue class field mod $\mathcal{P}$. The multiplicative 
group of units mod $\mathcal{P}$, denoted by $(D/\mathcal{P})^*$, is cyclic of order $p^f - 1$. Let $\alpha$ 
in $D$ be an element not in $\mathcal{P}$. There is an integer $i$, unique modulo 5, such that 
$\alpha^{(p^f - 1)/5} \equiv \zeta^i \pmod{\mathcal{P}}$. The quintic symbol $(\alpha/\mathcal{P})$ is defined to be that unique 
fifth root of unity satisfying 

$$\alpha^{(p^f - 1)/5} \equiv (\alpha/\mathcal{P}) \pmod{\mathcal{P}} \tag{2.1}$$ 

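The symbol has the following properties: 

$(\alpha/\mathcal{P}) = 1$ if, and only if, 

$$x^5 \equiv \alpha \pmod{\mathcal{P}} \tag{2.2}$$ 

is solvable in $D$. 

For every $\sigma \in G$ 

$$(\alpha/\mathcal{P})^{\sigma} = (\sigma(\alpha)/\sigma(\mathcal{P})) \tag{2.3}$$ 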

Let $\mathcal{A}$ be an ideal in $D$, prime to 5. Then $\mathcal{A}$ can be written as a product 
of prime ideals: $\mathcal{A} = \mathcal{P}_1 \cdots \mathcal{P}_s$. Let $\alpha \in D$ be prime to $\mathcal{A}$. The symbol $(\alpha/\mathcal{A})$ is 
defined as the product of the symbols $(\alpha/\mathcal{P}_1) \cdots (\alpha/\mathcal{P}_s)$. Let $\beta \in D$ be prime to 5 
and to $\alpha$. The symbol $(\alpha/\beta)$ is defined as $(\alpha/(\beta))$. 

$D$ is a Principal Ideal Domain (PID) (see the notes on page 200 of [7] for 
literature on cyclotomic fields with class number one). An element $\alpha \in D$ is 
called primary if it is not a unit, is prime to 5 and is congruent to a rational 
integer modulo $(1 - \zeta)^2$. For each $\alpha \in D$, prime to 5, there is an integer $c$ in $\mathbb{Z}$, 
unique modulo 5, such that $\zeta^c\alpha$ is primary. In particular, every prime ideal $\mathcal{P}$ 
in $D$ has a primary generator $\pi$. 

Quintic Reciprocity Law: 

Let M be an integer, prime to 5. Let a be a primary element of D and assume 
a is prime to M and prime to 5. Then 

{a/M) = (M/a) (2.4) 

3 Proof of the Theorem 

The condition imposed on the prime $p$ implies $p \equiv 1 \pmod 5$ (otherwise the 
equation $x^5 \equiv M \pmod p$ would have an integer solution). It follows that the 
ideal $(p)$ factors as the product of four prime ideals in $D$. These are all principal, 
since $D$ is a PID. We denote by $\pi$ a primary generator of one of these prime 
ideals. The other ideals are generated by the Galois conjugates of $\pi$, which are 
also primary. 

We note that $(M/\pi) \ne 1$, otherwise $M$ would be a fifth power modulo each 
of $\pi$'s Galois conjugates, hence modulo $p$. We prove 
i) implies ii) 

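Suppose first $f = 1$. 

Let $(M/\pi) = \zeta^{i_1}$. Then $i_1 \not\equiv 0 \pmod 5$. Since $M$ is a rational prime, $M \equiv 
1 \pmod 5$, then $(M)$ factors in $D$ as the product of 4 prime ideals. We write 
$(M) = (\mu_1)(\sigma_2(\mu_1))(\bar{\mu}_1)(\sigma_2(\bar{\mu}_1))$. We get 

$$\begin{aligned}
\zeta^{i_1} = (M/\pi) &= (\pi/M) && \text{(by (2.4))} \\
&= (\pi/\mu_1)(\pi/\sigma_2(\mu_1))(\pi/\bar{\mu}_1)(\pi/\sigma_2(\bar{\mu}_1)) && \text{(because } (M) = (\mu_1)(\sigma_2(\mu_1))(\bar{\mu}_1)(\sigma_2(\bar{\mu}_1))\text{)} \\
&= \left(\tfrac{\pi}{\bar{\pi}}\big/\mu_1\right)\left(\tfrac{\pi}{\bar{\pi}}\big/\sigma_2(\mu_1)\right) = \left(\tfrac{\pi}{\bar{\pi}}\big/\mu_1\right)\left(\sigma_3\!\left(\tfrac{\pi}{\bar{\pi}}\right)\big/\mu_1\right)^{-3} = \left(\left(\tfrac{\pi}{\bar{\pi}}\right)^{1-3\sigma_3}\big/\mu_1\right) && \text{(by (2.3))} \\
&\equiv \left(\left(\tfrac{\pi}{\bar{\pi}}\right)^{1-3\sigma_3}\right)^{(M-1)/5} \pmod{\mu_1} && \text{(by (2.1))} \\
&= \alpha^{5^{n-1}} \pmod{\mu_1} && \text{(since } \Phi_1(M) = M - 1\text{)}
\end{aligned}$$ 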

Next suppose $f = 2$. In this case $(M) = (\mu)(\sigma_2(\mu))$. Again we use (2.4), (2.3) 
and (2.1). This time we get: 

There is an integer $i_2 \not\equiv 0 \pmod 5$, such that 

$$\zeta^{i_2} = (\pi/M) = (\pi/\mu)(\pi/\sigma_2(\mu)) = (\pi^{1-3\sigma_3}/\mu) \equiv (\pi^{1-3\sigma_3})^{(M^2-1)/5} \pmod{\mu}.$$ 

Noting that raising to the $M$th power mod $\mu$ is the same as complex conjugation 
mod $\mu$ and that $\Phi_2(M) = M + 1$ we get the result. Finally, if $f = 4$, 
$(M)$ remains prime in $D$. We get $(M/\pi) = (\pi/M) \equiv \pi^{(M^4-1)/5} \pmod{M} = 
\pi^{(M^2-1)(M^2+1)/5} \pmod{M}$. This time raising to the $M^2$ power is equivalent to 
complex conjugation and so we obtain the desired result □. 

ii) implies iii) 

For $k \ge 0$ let $T_k = \mathrm{Trace}_{K/\mathbb{Q}}(\alpha^{5^k} + \bar{\alpha}^{5^k})$ and $N_k = \mathrm{Norm}_{K/\mathbb{Q}}(\alpha^{5^k} + \bar{\alpha}^{5^k})$. 
We claim that $T_k$ and $N_k$ satisfy the recurrence relations given by (1.1) and (1.2). 
To see this we let $A_k = \alpha^{5^k} + \bar{\alpha}^{5^k}$ and $B_k = \sigma_2(A_k)$. 

So $T_k = A_k + B_k$ and $N_k = A_kB_k$. 
We first will obtain (1.1). 

Raising $T_k$ to the fifth power we get 

$$A_k^5 + B_k^5 = T_k^5 - 5N_k(A_k^3 + B_k^3) - 10N_k^2T_k \tag{3.1}$$ 

Computing $T_k^3$ we obtain: 

$$A_k^3 + B_k^3 = T_k^3 - 3N_kT_k \tag{3.2}$$ 

On the other hand, keeping in mind that $\bar{\alpha} = \alpha^{-1}$, one gets: 

$$A_k^5 = A_{k+1} + 5\left((\alpha^{5^k})^3 + (\alpha^{-5^k})^3\right) + 10A_k \tag{3.3}$$ 

and 

$$A_k^3 = (\alpha^{5^k})^3 + (\alpha^{-5^k})^3 + 3A_k \tag{3.4}$$ 

Combining (3.3) with (3.4) leads to: 

$$A_{k+1} = A_k^5 - 5A_k^3 + 5A_k \tag{3.5}$$ 

Similarly, one obtains 

$$B_{k+1} = B_k^5 - 5B_k^3 + 5B_k \tag{3.6}$$ 

Adding (3.5) and (3.6) we get 

$$T_{k+1} = (A_k^5 + B_k^5) - 5(A_k^3 + B_k^3) + 5T_k \tag{3.7}$$ 

Substituting (3.1) and (3.2) in (3.7) we obtain (1.1). 

To obtain (1.2) we first multiply (3.5) and (3.6). This leads to: 

$$N_{k+1} = N_k^5 - 5N_k^3(A_k^2 + B_k^2) + 5N_k(A_k^4 + B_k^4) + 25N_k^3 - 25N_k(A_k^2 + B_k^2) + 25N_k \tag{3.8}$$ 

Next we note: 

$$A_k^2 + B_k^2 = T_k^2 - 2N_k \tag{3.9}$$ 

from where we deduce 

$$A_k^4 + B_k^4 = (T_k^2 - 2N_k)^2 - 2N_k^2 \tag{3.10}$$ 

(1.2) is then obtained by substituting (3.9) and (3.10) in (3.8). 

Since we have proved that $T_k$ and $N_k$ satisfy the recurrence relations given 
by (1.1) and (1.2), ii) implies that $T_{n-1} \equiv (\zeta + \zeta^{-1}) + (\zeta^2 + \zeta^{-2}) = -1 \pmod{\mu}$. 
Since $T_{n-1}$ is a rational number, the congruence holds modulo $\mu \cap \mathbb{Q} = 
M$. Similarly we get $N_{n-1} \equiv -1 \pmod{M}$ □. 
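
The derivation of (1.1) and (1.2) only uses $T_k = A_k + B_k$, $N_k = A_kB_k$ and the fact that $A_k$ and $B_k$ are both mapped by $y \mapsto y^5 - 5y^3 + 5y$, so it can be checked numerically with ordinary integers in place of $A_k$ and $B_k$. The following Python sketch does exactly that; the names `step` and `dickson5` are illustrative and do not come from the paper.

```python
def dickson5(y):
    """The map of (3.5) and (3.6): y = x + 1/x is sent to x**5 + 1/x**5."""
    return y ** 5 - 5 * y ** 3 + 5 * y


def step(T, N, M=None):
    """One application of the recurrences (1.1) and (1.2), optionally reduced modulo M."""
    T1 = T ** 5 - 5 * N * T ** 3 + 5 * N ** 2 * T + 15 * N * T - 5 * T ** 3 + 5 * T
    S2 = T * T - 2 * N                                    # A^2 + B^2, equation (3.9)
    N1 = (N ** 5 - 5 * N ** 3 * S2 + 5 * N * (S2 * S2 - 2 * N * N)
          + 25 * N ** 3 - 25 * N * S2 + 25 * N)
    if M is not None:
        T1, N1 = T1 % M, N1 % M
    return T1, N1


def check(a, b):
    """Compare the recurrences with the direct images of a and b."""
    return step(a + b, a * b) == (dickson5(a) + dickson5(b), dickson5(a) * dickson5(b))
```

With $a = 1$, $b = 2$ one has $T = 3$, $N = 2$, and both sides give $(3, 2)$, since $1$ and $2$ happen to be fixed by the map.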

iii) implies i) 

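We will show that under the hypothesis every prime divisor $Q$ of $M$ is larger 
than the square root of $M$. This will imply that $M$ is prime. Let $Q$ be a prime divisor 
of $M$. Let $\mathcal{Q}$ be a prime ideal in $D$ lying over $Q$. Clearly, (1.4) holds modulo $\mathcal{Q}$. 
We will show that (1.3) also holds modulo $\mathcal{Q}$. 

From 

$$T_{n-1} = \mathrm{Trace}_{K/\mathbb{Q}}(\alpha^{5^{n-1}} + \bar{\alpha}^{5^{n-1}}) \equiv -1 \pmod{\mathcal{Q}}$$ 

and 

$$N_{n-1} = \mathrm{Norm}_{K/\mathbb{Q}}(\alpha^{5^{n-1}} + \bar{\alpha}^{5^{n-1}}) \equiv -1 \pmod{\mathcal{Q}},$$ 

we deduce that $(\alpha^{5^{n-1}} + \bar{\alpha}^{5^{n-1}})$ has the same norm and trace modulo $\mathcal{Q}$ as 
$(\zeta + \zeta^{-1})$; it follows that $(\alpha^{5^{n-1}} + \bar{\alpha}^{5^{n-1}}) \equiv (\zeta + \zeta^{-1}) \pmod{\mathcal{Q}}$ or 
$(\alpha^{5^{n-1}} + \bar{\alpha}^{5^{n-1}}) \equiv (\zeta^2 + \zeta^{-2}) \pmod{\mathcal{Q}}$. This fact, together with the fact $\alpha^{-1} = \bar{\alpha}$, leads to 
$\alpha^{5^{n-1}} \equiv \zeta^i \pmod{\mathcal{Q}}$ for some $i \not\equiv 0 \pmod 5$. Hence the class of $\alpha \pmod{\mathcal{Q}}$ has order 
$5^n$ in the multiplicative group of units $(D/\mathcal{Q})^*$. It follows that $5^n$ divides the 
order of this group, which is a divisor of $Q^4 - 1$. In other words, $Q^4 - 1 \equiv 0 \pmod{5^n}$. 

Since by hypothesis no solution of this last congruence less than $5^n$ is a 
divisor of $M$, it follows that $Q$ is larger than $5^n$, which in turn is larger than the square 
root of $M$, by the hypothesis made on $A$. □ 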



4 Remarks on the Implementation 

In this section we make remarks on the implementation, and find $T_0$ and 
$N_0$. We also study what happens if $M$ is a fifth power modulo $p$. 

- Although in principle part ii) of Theorem 1 provides an algorithm for testing 
the primality of $M$, it assumes that a factorization of $\Phi_5(x)$ modulo $M$ is 
given. If $M$ is not a prime, the algorithm that finds this factorization may 
not converge. Part iii) instead gives an algorithm that is easy to implement, provided 
that $N_0$ and $T_0$ are computed. 

- Note that the recurrence relations (1.1) and (1.2) are independent of the 
value of $p$. This is the case because $\bar{\alpha} = \alpha^{-1}$. 

- In practice $A$ is fixed, while $n$ is taken in a range of values which vary from 
relatively small to as large as possible. In the cases $f = 1$ and $f = 2$ we 
obtain 

$$T_0 = \mathrm{Trace}_{K/\mathbb{Q}}(\alpha + \bar{\alpha}) = \mathrm{Trace}_{K/\mathbb{Q}}\left((\gamma/\bar{\gamma})^A + (\bar{\gamma}/\gamma)^A\right)$$ 

and 

$$N_0 = \mathrm{Norm}_{K/\mathbb{Q}}(\alpha + \bar{\alpha}) = \mathrm{Norm}_{K/\mathbb{Q}}\left((\gamma/\bar{\gamma})^A + (\bar{\gamma}/\gamma)^A\right).$$ 

Hence $T_0$ and $N_0$ are computable with $O(\log A)$ modular operations. 

When $f = 4$ the calculation of $\alpha \pmod M$ is longer. In fact, in this case 
$\alpha = (\pi/\bar{\pi})^{(M^2+1)/5^n}$. The exponent this time is very big, and the calculation 
of $T_0$ and $N_0$ in this case involves a lot of work. The calculation is still done 
with $O(\log M)$ modular operations, but no longer with $O(\log A)$, as it is 
in the cases $f = 1$ and $f = 2$. The following observation reduces somewhat 
the amount of work involved in the computation of $\alpha \pmod M$ for the case 
$f = 4$. 

When dividing $(M^2 + 1)/5^n$ by $M$ one obtains: 

$$(M^2 + 1)/5^n = AM + (\omega_n^2 + 1)/5^n \pm A\omega_n$$ 

The calculation of $\alpha \pmod M$ is therefore simplified by keeping in mind that 
raising $(\gamma/\bar{\gamma})$ to the $M$th power modulo $M$ is equivalent to applying $\sigma_2$ or 
$\sigma_3$, according to the congruence of $M \pmod 5$. 

- If $p \equiv 1 \pmod 5$ and $M$ is a fifth power modulo $p$, the following proposition 
provides a sufficient condition to prove the primality of $M$. 

Proposition 1. If $T_k \equiv N_k \equiv -1 \pmod M$ for some $k$ such that $5^k$ is 
larger than the square root of $M$ and if no nontrivial solution of $x^4 \equiv 1 \pmod{5^k}$, 
$x < 5^k$, is a divisor of $M$, then $M$ is prime. 

The proof of this proposition goes along the lines of iii) implies i) in Theorem 
1, the key point being that $\alpha$ has order $5^{k+1}$ mod $\mathcal{Q}$, so that $Q^4 \equiv 1 \pmod{5^k}$, which obliges $Q$ to be too 
large or a smaller solution of $x^4 \equiv 1 \pmod{5^k}$ □. 

This proposition may be particularly useful when $A$ is much smaller than 
$5^n$. 
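
The division identity used for the case $f = 4$ follows from expanding $M^2 + 1$ with $M = A5^n \pm \omega_n$ and $\omega_n^2 \equiv -1 \pmod{5^n}$. A small Python check, with the hypothetical helper name `split_exponent`, makes the bookkeeping explicit; it is a sketch for illustration only.

```python
def split_exponent(A, n, w, sign=1):
    """For M = A*5**n + sign*w with w*w == -1 (mod 5**n), return M and both sides of
    (M**2 + 1)/5**n = A*M + (w**2 + 1)/5**n + sign*A*w."""
    q = 5 ** n
    if (w * w + 1) % q != 0:
        raise ValueError("w must satisfy w**2 == -1 (mod 5**n)")
    M = A * q + sign * w
    lhs = (M * M + 1) // q                    # exact, since M**2 == w**2 == -1 (mod 5**n)
    rhs = A * M + (w * w + 1) // q + sign * A * w
    return M, lhs, rhs
```

For instance, with $A = 3$, $n = 4$, $\omega_4 = 182$ and the plus sign, $M = 2057$ and both sides equal $6770$.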



5 Implementation 



Table 1 below consists of a 25x2 matrix containing all numbers $\omega_n$, $0 < n \le 25$, 
such that $\omega_n^2 + 1 \equiv 0 \pmod{5^n}$; $0 < \omega_n < 5^n$; $\omega_n \equiv \pm 2 \pmod 5$. The first 
column contains exactly those $\omega_n$ which are congruent to $2 \pmod 5$ and the 
second column those which are congruent to $3 \pmod 5$. The term $n + 1$ of the 
first column, $\omega_{n+1}$, is obtained from the $n$th term of the same column by the 
following formula: 

$$\omega_{n+1} = \omega_n + \left[\frac{\omega_n^2 + 1}{5^n} \bmod 5\right] 5^n, \qquad \omega_1 = 2.$$ 

For the second column we use 

$$\omega_{n+1} = \omega_n + \left[-\frac{\omega_n^2 + 1}{5^n} \bmod 5\right] 5^n, \qquad \omega_1 = 3.$$ 

Table 1. $\omega_n$ 

 n    ω_n (ω_1 = 2)          ω_n (ω_1 = 3) 
 1    2                      3 
 2    7                      18 
 3    57                     68 
 4    182                    443 
 5    2057                   1068 
 6    14557                  1068 
 7    45807                  32318 
 8    280182                 110443 
 9    280182                 1672943 
 10   6139557                3626068 
 11   25670807               23157318 
 12   123327057              120813568 
 13   123327057              1097376068 
 14   5006139557             1097376068 
 15   11109655182            19407922943 
 16   102662389557           49925501068 
 17   407838170807           355101282318 
 18   3459595983307          355101282318 
 19   3459595983307          15613890344818 
 20   79753541295807         15613890344818 
 21   365855836217682        110981321985443 
 22   2273204469030182       110981321985443 
 23   2273204469030182       9647724486047943 
 24   49956920289342682      9647724486047943 
 25   109561565064733307     188461658812219818 
Table 2 below also consists of a 24x2 matrix; this time the $A$th term of the 
first column contains a list of values of $n$, $0 < n < 100$, such that $M = A5^n + \omega_n$, 
with $\omega_n \equiv 2 \pmod 5$, is prime, followed by the time it took a Pentium II, 350 
MHz, to compute them, using the program we next describe. Maple was used 
for the implementation. 



Table 2. Primes for $\omega_1 = 2, 3$ 

      ω_1 = 2                         ω_1 = 3 
 A    1 ≤ n ≤ 100         time        1 ≤ n ≤ 100                  time 
 1    1                   28.891      2,3,6,16,17,25               31.563 
 2    3,20,57,73          39.943      1,4,31                       26.087 
 3    1,22,24             27.705      3,12,73,77,82                34.346 
 4    2,3,5,17            24.494      1                            27.809 
 5    4,9,64              27.938      5,6                          35.504 
 6    2,5                 35.372      15,39                        27.162 
 7    1,34                28.933      2,5,16,35                    36.022 
 8    14                  35.883      1,4,24                       28.936 
 9    1,4,29,59           27.788      3,7,55                       36.717 
 10   2,3,10,11,13,43     37.103      1                            29.457 
 11   4,61,86             28.533      2,43,94                      36.183 
 12   2,27,32,63,73       36.900      21                           25.896 
 13   1,8,33,34,56        28.671      3,11,17,18,30,35,37,46,48    37.445 
 14   7,19,72             36.126      1,24,92                      28.857 
 15   -                   44.894      68,72                        38.615 
 16   5,13,17             37.311      1,28,76                      29.468 
 17   28                  28.510      2,5,11,27                    36.624 
 18   2,11,54,57          36.766      28,59                        30.104 
 19   1,15,21,23,69       28.971      5,7,35,81                    38.568 
 20   3,14                38.138      1                            31.106 
 21   1                   28.237      3,13,14,19,42,57             38.671 
 22   2,7,12,16,75        36.921      1,8,56                       30.001 
 23   4,8                 29.075      2,58,81                      38.983 
 24   2,78                36.275      4                            30.680 



The first column of Table 3 contains the values of $n$ for which $A5^n + 1$ is 
prime and the second column those values for which $A5^n - 1$ is prime. 

5.1 Description of the Algorithm 

Some precomputation is needed. 

We fix the primes p = 11, 31, 41, 61. 

For each of these primes we found a prime element of the cyclotomic 
ring $D$, which we will denote by $\Pi_p(\zeta)$, lying over $p$ (this means that 
$|\mathrm{Norm}_{\mathbb{Q}(\zeta)/\mathbb{Q}}(\Pi_p(\zeta))| = p$). 

Table 3. Primes for $\omega_1 = 1, -1$ 

      ω_1 = 1                                   ω_1 = -1 
 A    1 ≤ n ≤ 100                   time        1 ≤ n ≤ 100                    time 
 2    1,3,13,45                     12.013      4,6,16,24,30,54,96             12.321 
 4    2,6,18,50                     14.149      1,3,9,13,15,25,39,69           12.497 
 6    1,2,3,23,27,33,63             12.158      1,2,5,11,28,65,72              13.058 
 8    1                             12.715      2,4,8,10,28                    13.219 
 12   1,5,7,18,19,23,46,51,55,69    12.893      1,3,4,8,9,28,31,48,51,81       13.309 
 14   1,7,23,33                     13.239      2,6,14                         13.587 
 16   2,14,22,26,28,42              13.072      1,3,5,7,13,17,23,33,45,77      13.446 
 18   3,4,6,10,15,30                13.199      1,2,5,6,9,13,17,24,26,49,66    13.577 
 22   4,10,40                       13.907      1,3,5,7,27,35,89               14.085 
 24   2,3,8,19,37,47                12.921      2,3,10,14,15,23,27,57,60       13.715 



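We note that the left side of equation (1.3) does not vary if $\Pi_p(\zeta)$ is replaced 
by another prime lying over $p$ when $n \ge 2$. Therefore the condition of being primary 
may be disregarded, hence we let $\Pi_{11}(\zeta) = (\zeta + 2)$, $\Pi_{31}(\zeta) = (\zeta - 2)$, $\Pi_{41}(\zeta) = 
(\zeta^3 + 2\zeta^2 + 3\zeta + 3)$, $\Pi_{61}(\zeta) = (\zeta + 3)$. 

For the case $f = 1$ and $f = 2$ ($M = A5^n \pm 1$) we let 

$$\alpha_{p,f} = \left(\frac{\Pi_p}{\bar{\Pi}_p}\right)^{1 - 3\sigma_3};$$ 

for the case $f = 4$ (or $M = A5^n + \omega_n$, $\omega_n \equiv \pm 2 \pmod 5$), we let 

$$\alpha_{p,f} = \frac{\Pi_p}{\bar{\Pi}_p}$$ 

and 

$$T_{p,f,A,n} = \mathrm{Trace}_{K/\mathbb{Q}}\left(\alpha_{p,f}^{\Phi_f(M)/5^n} + \bar{\alpha}_{p,f}^{\Phi_f(M)/5^n}\right) \pmod M$$ 

$$N_{p,f,A,n} = \mathrm{Norm}_{K/\mathbb{Q}}\left(\alpha_{p,f}^{\Phi_f(M)/5^n} + \bar{\alpha}_{p,f}^{\Phi_f(M)/5^n}\right) \pmod M$$ 

The program finds the first value of $p$ for which $M$ is not a fifth power. If the 
condition is not satisfied a note is made and these numbers are later tested by 
other means. 

Otherwise we set $T_0 = T_{p,f,A,n}$ and $N_0 = N_{p,f,A,n}$ and we use the recurrence 
equations (1.1) and (1.2) to verify if (1.4) holds. 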

When $f = 1$ or $2$ we note that $\Phi_f(M)/5^n = A$. Hence $T_{p,f,A,n}$ depends only 
on $A$, not on $n$. In this case, for relatively small values of $A$, we recommend 
computing the value of 

$$\mathrm{Trace}_{K/\mathbb{Q}}\left(\alpha_{p,f}^A + \bar{\alpha}_{p,f}^A\right) \quad \text{(not modulo } M\text{)}$$ 

and 

$$\mathrm{Norm}_{K/\mathbb{Q}}\left(\alpha_{p,f}^A + \bar{\alpha}_{p,f}^A\right) \quad \text{(not modulo } M\text{)}$$ 

These same numbers may be used as the starting numbers $T_0$ and $N_0$ for all 
numbers $n$ in a given range. If for a fixed value of $A$ the calculation of $T_0$ and $N_0$ 
is counted as part of the precomputation, then the complexity of the primality 
test for numbers of the form $A5^n \pm 1$, which are not congruent to a fifth power 
modulo $p$, is simply the complexity of the calculation of the recurrence relations 
(1.1) and (1.2) $n-1$ times. 

When $f = 4$, $\Phi_f(M)/5^n$ is large and depends on $A$ and $n$. In this case, even 
for small values of $A$, the computation of $T_0$ and $N_0$ is, for each value of $M$, of 
approximately the same complexity as the computation of $T_{n-1}$, $N_{n-1}$, given $T_0$ 
and $N_0$. 
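
To make the procedure of Section 5.1 concrete for $M = A5^n \pm 1$ (the cases $f = 1, 2$), here is a minimal Python sketch. It is not the authors' Maple program. It assumes $p = 11$ with $\Pi_{11} = \zeta + 2$, $n \ge 2$, $M$ coprime to 11, and that $M$ is not a fifth power modulo 11 (that is, $M \not\equiv \pm 1 \pmod{11}$). Elements of $D/(M)$ are stored as four coefficients of $1, \zeta, \zeta^2, \zeta^3$, and all function names are illustrative.

```python
def _reduce(c, M):
    # c holds coefficients of 1, z, ..., z^4; use z^4 = -(1 + z + z^2 + z^3)
    return [(c[k] - c[4]) % M for k in range(4)]

def mul(a, b, M):
    c = [0] * 5
    for i in range(4):
        for j in range(4):
            c[(i + j) % 5] += a[i] * b[j]          # z^5 = 1
    return _reduce(c, M)

def sigma(a, c, M):
    e = [0] * 5                                    # Galois action z -> z^c
    for i in range(4):
        e[(c * i) % 5] += a[i]
    return _reduce(e, M)

def power(a, e, M):
    r = [1 % M, 0, 0, 0]
    while e:
        if e & 1:
            r = mul(r, a, M)
        a = mul(a, a, M)
        e >>= 1
    return r

def inverse(a, M):
    # a^{-1} = sigma_2(a) sigma_3(a) sigma_4(a) / Norm(a); raises ValueError if not invertible
    cof = mul(mul(sigma(a, 2, M), sigma(a, 3, M), M), sigma(a, 4, M), M)
    norm = mul(a, cof, M)[0]
    inv = pow(norm, -1, M)
    return [x * inv % M for x in cof]

def step(T, N, M):
    # recurrences (1.1) and (1.2) modulo M
    S2 = T * T - 2 * N
    T1 = T**5 - 5*N*T**3 + 5*N*N*T + 15*N*T - 5*T**3 + 5*T
    N1 = N**5 - 5*N**3*S2 + 5*N*(S2*S2 - 2*N*N) + 25*N**3 - 25*N*S2 + 25*N
    return T1 % M, N1 % M

def passes_test(A, n, sign, pi=(2, 1, 0, 0)):
    """Check (1.4) for M = A*5**n + sign, sign = +1 or -1, with Pi = zeta + 2 by default."""
    M = A * 5**n + sign
    pi = [x % M for x in pi]
    beta = mul(pi, inverse(sigma(pi, 4, M), M), M)                # Pi / conj(Pi)
    g = mul(beta, inverse(power(sigma(beta, 3, M), 3, M), M), M)  # beta^(1 - 3 sigma_3)
    alpha = power(g, A, M)                                        # Phi_f(M)/5^n = A
    a0 = [(x + y) % M for x, y in zip(alpha, sigma(alpha, 4, M))]
    b0 = sigma(a0, 2, M)
    T, N = (a0[0] + b0[0]) % M, mul(a0, b0, M)[0]                 # T_0 and N_0
    for _ in range(n - 1):
        T, N = step(T, N, M)
    return T == M - 1 and N == M - 1
```

Under the stated assumptions, `passes_test(2, 3, 1)` tests $M = 251$ and `passes_test(2, 4, -1)` tests $M = 1249$, both of which appear in Table 3. When $M$ is a fifth power modulo 11 the function may reject a prime, and a different $p$ would have to be tried, as described in the text.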
Abstract. This paper shows that the largest possible contrast $C_{k,n}$ in 
a $k$-out-of-$n$ secret sharing scheme is approximately $4^{-(k-1)}$. More 
precisely, we show that $4^{-(k-1)} \le C_{k,n} \le 4^{-(k-1)} n^k/(n(n-1) \cdots (n-(k-1)))$. 
This implies that the largest possible contrast equals $4^{-(k-1)}$ in the 
limit when $n$ approaches infinity. For large $n$, the above bounds leave 
almost no gap. For values of $n$ that come close to $k$, we will present 
alternative bounds (being tight for $n = k$). The proofs of our results 
proceed by revealing a central relation between the largest possible contrast 
in a secret sharing scheme and the smallest possible approximation error 
in problems occurring in Approximation Theory. 



1 Introduction 

Visual cryptography and $k$-out-of-$n$ secret sharing schemes are notions 
introduced by Naor and Shamir in [10]. A sender wishing to transmit a secret 
message distributes $n$ transparencies among $n$ recipients, where the transparencies 
contain seemingly random pictures. A $k$-out-of-$n$ scheme achieves the following 
situation: If any $k$ recipients stack their transparencies together, then a secret 
message is revealed visually. On the other hand, if only $k - 1$ recipients stack 
their transparencies, or analyze them by any other means, they are not able to 
obtain any information about the secret message. The reader interested in more 
background information about secret sharing schemes is referred to [10]. 

An important measure of a scheme is its contrast, i.e., the clarity with which 
the message becomes visible. This parameter lies in the interval [0,1], where 
contrast 1 means "perfect clarity" and contrast 0 means "invisibility". Naor and 
Shamir constructed $k$-out-of-$k$ secret sharing schemes with contrast $2^{-(k-1)}$ and 
were also able to prove optimality. However, they did not determine the largest 
possible contrast $C_{k,n}$ for arbitrary $k$-out-of-$n$ secret sharing schemes. 

Subsequently, several attempts were made to find accurate estimates 
for the optimal contrast and the optimal tradeoff between contrast and 
subpixel expansion for arbitrary $k$-out-of-$n$ secret sharing schemes [4],[5],[1],[2], 
[3]. For $k = 2$ and arbitrary $n$ this problem was completely solved by Hofmeister, 
Krause, and Simon in [5]. But the underlying methods, which are based on the 
theory of linear codes, do not work for $k \ge 3$. Strengthening the approach of 
Droste [4], the first step in the direction of determining $C_{k,n}$ for some values $k$ 
and $n$, where $k \ge 3$, was taken in [5]. They presented a simple linear program 
LP($k$,$n$) whose optimal solution represents a contrast-optimal $k$-out-of-$n$ secret 
sharing scheme. The profit achieved by this solution equals $C_{k,n}$. Although $C_{k,n}$ 
was computable in poly($n$) steps this way, and even elementary formulas were 
given for $k = 3, 4$, there was still no general formula for $C_{k,n}$ (or for good 
bounds). Based on computations of $C_{k,n}$ for specific choices of $k, n$, it was 
conjectured in [5] that $C_{k,n} \ge 4^{-(k-1)}$ with equality in the limit when $n$ approaches 
infinity. In [2] and [3], some of the results from [5] concerning $k = 3, 4$ and 
arbitrary $n$ could be improved. Furthermore, in [3], Blundo, D'Arco, De Santis 
and Stinson determine the optimal contrast of $k$-out-of-$n$ secret sharing schemes 
for arbitrary $n$ and $k = n - 1$. 

In this paper, we confirm the above conjecture of [5] by showing the following 
bounds on $C_{k,n}$: 

$$4^{-(k-1)} \le C_{k,n} \le 4^{-(k-1)} \frac{n^k}{n(n-1) \cdots (n-(k-1))}.$$ 

This implies that the largest possible contrast equals $4^{-(k-1)}$ in the limit when 
$n$ approaches infinity. For large $n$, the above bounds leave almost no gap. For 
values of $n$ that come close to $k$, we will present alternative bounds (being 
tight for $n = k$). The proofs of our results proceed by revealing a central relation 
between the largest possible contrast in a secret sharing scheme and the smallest 
possible approximation error in problems occurring in Approximation Theory. A 
similar relation was used in the paper [8] of Linial and Nisan about Approximate 
Inclusion-Exclusion (although there are also some differences and paper [8] ends 
up with problems in Approximation Theory that are different from ours). 
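
To get a feeling for how tight the two bounds are, they can be evaluated exactly with rational arithmetic. The short Python sketch below does this; the helper names `falling_power` and `contrast_bounds` are chosen here for illustration and are not taken from the paper.

```python
from fractions import Fraction


def falling_power(n, k):
    """n (n - 1) ... (n - (k - 1))."""
    result = 1
    for i in range(k):
        result *= n - i
    return result


def contrast_bounds(k, n):
    """Lower and upper bound on C_{k,n} as stated above, as exact fractions."""
    lower = Fraction(1, 4 ** (k - 1))
    upper = lower * Fraction(n ** k, falling_power(n, k))
    return lower, upper
```

For $k = 3$ and $n = 10$ this gives the interval $[1/16, 25/288]$, and the ratio of the two bounds tends to 1 as $n$ grows with $k$ fixed.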



2 Definitions and Notations 

For the sake of completeness, we recall the definition of visual secret sharing 
schemes given in [10]. In the sequel, we simply refer to them under the notion 
scheme. For a 0-1-vector $v$, let $H(v)$ denote the Hamming weight of $v$, i.e., the 
number of ones in $v$. 

Definition 1. A $k$-out-of-$n$ scheme $\mathcal{C} = (\mathcal{C}_0, \mathcal{C}_1)$ with $m$ subpixels, contrast $\alpha = 
\alpha(\mathcal{C})$ and threshold $d$ consists of two collections of Boolean $n \times m$ matrices $\mathcal{C}_0 = 
[C_{0,1}, \ldots, C_{0,r}]$ and $\mathcal{C}_1 = [C_{1,1}, \ldots, C_{1,s}]$, such that the following properties are 
valid: 

1. For any matrix $S \in \mathcal{C}_0$, the OR $v$ of any $k$ out of the $n$ rows of $S$ satisfies 
$H(v) \le d - \alpha m$. 

2. For any matrix $S \in \mathcal{C}_1$, the OR $v$ of any $k$ out of the $n$ rows of $S$ satisfies 
$H(v) \ge d$. 

3. For any $q < k$ and any $q$-element subset $\{i_1, \ldots, i_q\} \subseteq \{1, \ldots, n\}$, the two 
collections of $q \times m$ matrices $\mathcal{D}_0$ and $\mathcal{D}_1$ obtained by restricting each $n \times m$ 
matrix in $\mathcal{C}_0$ and $\mathcal{C}_1$ to rows $i_1, \ldots, i_q$ are indistinguishable in the sense that 
they contain the same matrices with the same relative frequencies. 

$k$-out-of-$n$ schemes are used in the following way to achieve the situation 
described in the introduction. The sender translates every pixel of the secret 
image into $n$ sets of subpixels, in the following way: If the sender wishes to 
transmit a white pixel, then she chooses one of the matrices from $\mathcal{C}_0$ according 
to the uniform distribution. In the case of a black pixel, one of the matrices from 
$\mathcal{C}_1$ is chosen. For all $1 \le i \le n$, recipient $i$ obtains the $i$-th row of the chosen 
matrix as an array of subpixels, where a 1 in the row corresponds to a black 
subpixel and a 0 corresponds to a white subpixel. The subpixels are arranged in 
a fixed pattern, e.g. a rectangle. (Note that in this model, stacking transparencies 
corresponds to "computing" the OR of the subpixel arrays.) 

The third condition in Definition 1 is often referred to as the "security 
property", which guarantees that any $k - 1$ of the recipients cannot obtain any 
information out of their transparencies. The "contrast property", represented by 
the first two conditions in Definition 1, guarantees that $k$ recipients are able to 
recognize black pixels visually, since any array of subpixels representing a black 
pixel contains a "significant" amount of black subpixels more than any array 
representing a white pixel. ¹ 
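
The three conditions of Definition 1 can be checked mechanically for small schemes. The Python sketch below does so by brute force and applies it to the familiar 2-out-of-2 scheme with $m = 2$ subpixels, threshold $d = 2$ and contrast $1/2$. The function names are illustrative, and the matrices are written out here as an example rather than quoted from the paper.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def stacked_weight(matrix, rows):
    """Hamming weight of the OR of the selected rows."""
    m = len(matrix[0])
    return sum(1 for col in range(m) if any(matrix[r][col] for r in rows))


def frequencies(collection, rows):
    """Relative frequencies of the submatrices obtained by restricting to `rows`."""
    counts = Counter(tuple(tuple(S[r]) for r in rows) for S in collection)
    return {key: Fraction(v, len(collection)) for key, v in counts.items()}


def is_scheme(C0, C1, k, d, alpha):
    """Brute-force check of conditions 1 to 3 of Definition 1."""
    n, m = len(C0[0]), len(C0[0][0])
    for rows in combinations(range(n), k):
        if any(stacked_weight(S, rows) > d - alpha * m for S in C0):
            return False                      # condition 1 violated
        if any(stacked_weight(S, rows) < d for S in C1):
            return False                      # condition 2 violated
    for q in range(1, k):
        for rows in combinations(range(n), q):
            if frequencies(C0, rows) != frequencies(C1, rows):
                return False                  # condition 3 (security) violated
    return True


# 2-out-of-2 example: equal rows for a white pixel, complementary rows for a black pixel
WHITE = [[[1, 0], [1, 0]], [[0, 1], [0, 1]]]
BLACK = [[[1, 0], [0, 1]], [[0, 1], [1, 0]]]
```

Here `is_scheme(WHITE, BLACK, 2, 2, Fraction(1, 2))` holds: stacking two white shares gives weight 1, stacking two black shares gives weight 2, and each single row is `[1, 0]` or `[0, 1]` with probability 1/2 in both collections.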

In [5], it was shown that the largest possible contrast $C_{k,n}$ in a $k$-out-of-$n$ 
scheme coincides with the maximal profit in the following linear program (with 
variables $\xi_0, \ldots, \xi_n$ and $\eta_0, \ldots, \eta_n$): 

Linear Program LP($k$,$n$) 

max subject to 

1. For $j = 0, \ldots, n$: $\xi_j \ge 0$, $\eta_j \ge 0$. 

2. $\sum_{j=0}^{n} \xi_j = \sum_{j=0}^{n} \eta_j = 1$. 

3. For Z = 0 , . . . , fc - 1: ~ V,) = 0. 

The following sections only use this linear program (and do not explicitly refer 
to Definition 1). 

We make the following conventions concerning matrices and vectors. For 
matrix $A$, $A'$ denotes its transpose (resulting from $A$ by exchanging rows and 
columns). A vector which is denoted by $c$ is regarded as a column vector. Thus, 
its transpose $c'$ is a row vector. The all-zeros (column) vector is denoted as $0$. 
For matrix $A$, $A_j$ denotes its $j$-th row vector. $A'_j$ denotes the $j$-th row vector of 
its transpose (as opposed to the transpose of the $j$-th row vector). 

¹ The basic notion of a secret sharing scheme, as given in Definition 1, has been 
generalized in several ways. The generalized schemes in [1], for instance, intend to 
achieve a situation where certain subsets of recipients can work successfully together, 
whereas other subsets will gain no information. If the two classes of subsets are the 
sets of at least $k$ recipients and the sets of at most $k - 1$ recipients, respectively, we 
obtain (as a special case) the schemes considered in this paper. Another model for 
2-out-of-2 schemes involving three colors is presented in [11]. 

3 Approximation Error and Contrast 

In Subsection 3.1, we relate the problem of finding the best $k$-out-of-$n$ secret 
sharing scheme to approximation problems of type BAV and BAP. Problem BAV 
(Best Approximating Vector) asks for the "best approximation" of a given vector 
$c$ within a vector space $V$. Problem BAP (Best Approximating Polynomial) asks 
for the "best approximation" of a given polynomial $p$ of degree $k$ within the set 
of polynomials of degree $k - 1$ or less. It turns out that, choosing $c$, $V$, $p$ properly, 
the largest possible contrast is twice the smallest possible approximation error. 
In Subsection 3.2, we use this relationship to determine lower and upper bounds. 
Moreover, the largest possible contrast is determined exactly in the limit (when 
$n$ approaches infinity). In Subsection 3.3, we derive a criterion that helps to 
determine those pairs $(k,n)$ for which $C_{k,n}$ coincides with its theoretical upper 
bound from Subsection 3.2. 

3.1 Secret Sharing Schemes and Approximation Problems 

As explained in Section 2, the largest possible contrast in a $k$-out-of-$n$ secret 
sharing scheme is the maximal profit in linear program LP($k$,$n$). The special 
form exhibited by LP($k$,$n$) is captured by the more abstract definitions of a 
linear program of type BAV (Best Approximating Vector) or of type BAP (Best 
Approximating Polynomial). 

We start with the discussion of type BAV. We say that a linear program LP 
is of type BAV if there exists a matrix $A \in \mathbb{R}^{k \times (1+n)}$ and a vector $c \in \mathbb{R}^{1+n}$ such 
that LP (with variables $\xi = (\xi_0, \ldots, \xi_n)$ and $\eta = (\eta_0, \ldots, \eta_n)$) can be written in 
the following form: 

The primal linear program LP($A$, $c$) of type BAV 

max $c'(\xi - \eta)$ subject to 
(LP1) $\xi \ge 0$, $\eta \ge 0$ 
(LP2) $\sum_{j=0}^{n} \xi_j = \sum_{j=0}^{n} \eta_j = 1$ 
(LP3) $A(\xi - \eta) = 0$ 

Condition (LP2) implies that 

$$\sum_{j=0}^{n} (\xi_j - \eta_j) = 0.$$ 

Thus, we could add the all-ones row vector $(1, \ldots, 1)$ to matrix $A$ in (LP3) 
without changing the set of legal solutions. For this reason, we assume in the 
sequel that the following condition holds in addition to (LP1), (LP2), (LP3): 

(LP4) The vector space $V_A$ spanned by the row vectors of $A$ contains the all-ones 
vector. 
We aim to show that linear program LP($A$, $c$) can be reformulated as the 
problem of finding the "best" approximation of $c$ in $V_A$. To this end, we pass to 
the dual problem² (with variables $s$, $t$ and $u = (u_0, \ldots, u_{k-1})$): 

The dual linear program DLP($A$, $c$) of type BAV 

min $s + t$ subject to 
(DLP1) $A'u + (s, \ldots, s)' \ge c$ 
(DLP2) $A'u - (t, \ldots, t)' \le c$ 

Conditions (DLP1) and (DLP2) are obviously equivalent to 

$$s \ge \max_{j=0,\ldots,n} (c_j - A'_ju) \quad \text{and} \quad t \ge \max_{j=0,\ldots,n} (A'_ju - c_j),$$ 

and an optimal solution certainly satisfies 

$$s = \max_{j=0,\ldots,n} (c_j - A'_ju) \quad \text{and} \quad t = \max_{j=0,\ldots,n} (A'_ju - c_j).$$ 

Note that vector $A'u$ is a linear combination of the row vectors of $A$. Thus, 
$V_A = \{A'u \mid u \in \mathbb{R}^k\}$. DLP($A$, $c$) can therefore be rewritten as follows: 

$$\min_{v \in V_A} \left[ \max_{j=0,\ldots,n} (c_j - v_j) + \max_{j=0,\ldots,n} (v_j - c_j) \right]$$ 

Consider a vector $v \in V_A$ and let 

$$j_-(v) = \arg\max_{j=0,\ldots,n} (c_j - v_j) \quad \text{and} \quad j_+(v) = \arg\max_{j=0,\ldots,n} (v_j - c_j).$$ 

Term $S(v) := c_{j_-(v)} - v_{j_-(v)}$ represents the penalty for being smaller than 
$c_{j_-(v)}$. Symmetrically, $L(v) := v_{j_+(v)} - c_{j_+(v)}$ represents the penalty for 
being larger than $c_{j_+(v)}$. Note that the total penalty $S(v) + L(v)$ does not change 
if we translate $v$ by a scalar multiple of the all-ones vector $(1, \ldots, 1)'$. According 
to (LP4), any translation of this form can be performed within $V_A$. Choosing 
the translation of $v$ appropriately, we can achieve $S(v) = L(v)$, that is, a perfect 
balance between the two penalty terms. Consequently, the total penalty for $v$ 
is twice the distance between $c$ and $v$ measured by the metric induced by the 
maximum-norm. We thus arrive at the following result. 

Theorem 1. Given linear program LP($A$, $c$) of type BAV, the maximal profit $C$ 
in LP($A$, $c$) satisfies 

$$C = 2 \cdot \min_{v \in V_A} \max_{j=0,\ldots,n} |c_j - v_j|.$$ 

Thus, the problem of finding an optimal solution to LP($A$, $c$) boils down to the 
problem of finding a best approximation of $c$ in $V_A$ w.r.t. the maximum-norm. 

² The rules, describing how the dual linear program is obtained from a given primal, 
can be looked up in any standard text about linear programming (like [12], for 
instance). 

We now pass to the discussion of linear programs of type BAP. We call 
d G evaluation-vector of polynomial p € 5R[A] if dj = p{j) for j = 0, . . . , n. 

We say that a linear program LP(A, c) of type BAV is of type BAP if, in addition 
to Conditions (LP1),..,(LP4), the following holds: 

(LP5) c is the evaluation vector of a polynomial, say p, of degree k. 

(LP6) Matrix A € s|fjfex(i-i-n) j j^g vectors are linearly inde- 

pendent. 

(LP7) For I = 0, . . . , k—1, row vector A; is the evaluation vector of a polynomial, 
say qi, of degree at most k—1. 

Let $P_m$ denote the set of polynomials of degree at most m. Conditions (LP6)
and (LP7) imply that $V_A$ is the vector space of evaluation vectors of polynomials
from $P_{k-1}$. Theorem 1 implies that the maximal profit C in a linear program of
type BAP satisfies

$$C = 2 \cdot \min_{q \in P_{k-1}} \max_{j=0,\ldots,n} |p(j) - q(j)|.$$

Let $\lambda$ denote the leading coefficient of p. Thus p can be written as the sum of
$\lambda X^k$ and a polynomial in $P_{k-1}$. Obviously, p is as hard to approximate
within $P_{k-1}$ as $|\lambda| X^k$. We obtain the following result:

Corollary 1. Given linear program LP(A, c) of type BAP, let p denote the
polynomial of degree k with evaluation vector c, and $\lambda$ the leading coefficient
of p. Then the maximal profit C in LP(A, c) satisfies

$$C = 2 \cdot \min_{q \in P_{k-1}} \max_{j=0,\ldots,n} \big|\, |\lambda| j^k - q(j) \,\big|.$$

We introduce the notation

$$n^{\underline{k}} = n(n-1) \cdots (n-(k-1))$$

for so-called “falling powers” and proceed with the following result:

Lemma 1. The linear program LP(k, n) is of type BAP. The leading coefficient
of the polynomial p with evaluation vector c is $(-1)^k / n^{\underline{k}}$.

The proof of this lemma is obtained by a close inspection of LP(k, n) and a
(more or less) straightforward calculation.

Corollary 2. Let $C_{k,n}$ denote the largest possible contrast in a k-out-of-n
secret sharing scheme. Then:

$$C_{k,n} = 2 \cdot \min_{q \in P_{k-1}} \max_{j=0,\ldots,n} \big| j^k / n^{\underline{k}} - q(j) \big|.$$

Thus, the largest possible contrast in a k-out-of-n secret sharing scheme
is identical to twice the smallest “distance” between polynomial
$X^k / n^{\underline{k}}$ and a polynomial in $P_{k-1}$, where the “distance” between
two polynomials is measured as the maximum absolute difference of their
evaluations on points 0, 1, ..., n.
3.2 Lower and Upper Bounds

Finding the “best approximating polynomial” of $X^k$ within $P_{k-1}$ is a classical
problem in Approximation Theory. Most of the classical results are stated for
polynomials defined on interval [-1, 1]. In order to recall these results and to
apply them to our problem at hand, the definition of the following metric will
be useful:

$$d_\infty(f, g) = \max_{x \in [-1,1]} |f(x) - g(x)| \qquad (1)$$

This definition makes sense for functions that are continuous on [-1, 1] (in
particular for polynomials). The metric implicitly used in Corollaries 1 and 2 is
different because distance between polynomials is measured on a finite set of points
rather than on a continuous interval. For this reason, we consider sequence

$$z_j = -1 + \frac{2j}{n} \quad\text{for } j = 0, \ldots, n. \qquad (2)$$

It forms a regular subdivision of interval [-1, 1] of step width 2/n. The following
metric is a “discrete version” of $d_\infty$:

$$d_n(f, g) = \max_{j=0,\ldots,n} |f(z_j) - g(z_j)|. \qquad (3)$$

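Let $U_k(X) = X^k$ and $U^*_{k,\infty}$ the best approximation of $U_k$ within $P_{k-1}$
w.r.t. $d_\infty$. Analogously, $U^*_{k,n}$ denotes the best approximation of $U_k$
within $P_{k-1}$ w.r.t. $d_n$.

$$D_{k,\infty} = d_\infty(U_k, U^*_{k,\infty}) \quad\text{and}\quad D_{k,n} = d_n(U_k, U^*_{k,n}) \qquad (4)$$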

are the corresponding approximation errors. It is well known from Approximation
Theory (see Chapter 1.2 in [13], for instance) that

$$U^*_{k,\infty}(X) = X^k - 2^{-(k-1)} T_k(X) \qquad (5)$$

where $T_k$ denotes the Chebyshev polynomial of degree k (defined and visualized
in Figure 1). It is well known that $T_k(X) = \cos(k\theta)$ is a polynomial of degree
k in $X = \cos(\theta) \in [-1, 1]$ with leading coefficient $2^{k-1}$. Thus,
$U^*_{k,\infty}$ is indeed from $P_{k-1}$. Since $\max_{-1 \le x \le 1} |T_k(x)| = 1$, we get

$$D_{k,\infty} = 2^{-(k-1)}. \qquad (6)$$

Unfortunately, there is no such simple formula for $D_{k,n}$ (the quantity we are
interested in). It is however easy to see that the following inequalities are valid:

$$\left(1 - \frac{k^2}{n}\right) 2^{-(k-1)} \le D_{k,n} \le D_{k,\infty} = 2^{-(k-1)}. \qquad (7)$$

Inequality $D_{k,n} \le D_{k,\infty}$ is obvious because $d_n(f, g) \le d_\infty(f, g)$ for
all f, g. The first inequality can be derived from the fact that the first derivative
of $T_k$ is bounded by $k^2$ on [-1, 1] (applying some standard tricks). We will
improve on this inequality later and present a proof for the improved statement.

Fig. 1. The Chebyshev polynomial $T_k$ of degree k for k = 1, 2, 3. $T_k(X) = \cos(k\theta)$,
where $0 \le \theta \le \pi$ and $X = \cos(\theta)$.
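The two facts about Chebyshev polynomials used around Equation (5), namely that $T_k$ has leading coefficient $2^{k-1}$ and that $X^k - 2^{-(k-1)} T_k(X)$ therefore has degree below k, can be confirmed with a short computation. This is a sketch only: it relies on the standard three-term recurrence $T_{m+1} = 2X\,T_m - T_{m-1}$, which the text does not state, and the function names are hypothetical.

```python
from fractions import Fraction


def chebyshev(k):
    """Coefficients of T_k, lowest degree first, via T_{m+1} = 2X*T_m - T_{m-1}."""
    prev, cur = [1], [0, 1]  # T_0 = 1, T_1 = X
    if k == 0:
        return prev
    for _ in range(k - 1):
        nxt = [0] + [2 * a for a in cur]   # multiply T_m by 2X
        for i, a in enumerate(prev):       # subtract T_{m-1}
            nxt[i] -= a
        prev, cur = cur, nxt
    return cur


def best_approx_of_monomial(k):
    """Coefficients of X^k - 2^{-(k-1)} T_k(X), lowest degree first (k >= 1)."""
    scale = Fraction(1, 2 ** (k - 1))
    u = [-scale * a for a in chebyshev(k)]
    u[k] += 1                              # add the monomial X^k
    return u
```

For k = 2 the result is the constant 1/2, the familiar best uniform approximation of $X^2$ on [-1, 1] by a polynomial of degree at most 1.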

Quantities $D_{k,n}$ and $C_{k,n}$ are already indirectly related by Corollary 2. In
order to get the precise relation, we have to apply linear transformation
$X \to \frac{n}{2}(X + 1)$, because the values attained by a function $f(X)$ on
X = 0, ..., n coincide with the values attained by function $f(\frac{n}{2}(X + 1))$ on
$X = z_0, \ldots, z_n$. This transformation, applied to a polynomial of degree k with
leading coefficient $\lambda$, leads to a polynomial of the same degree with leading
coefficient $\lambda (\frac{n}{2})^k$. The results corresponding to Corollaries 1 and 2
now read as follows:

Corollary 3. Given a linear program LP(A, c) of type BAP, let p denote the
polynomial of degree k with evaluation vector c, and $\lambda$ the leading coefficient
of p. Then the maximal profit C in LP(A, c) satisfies

$$C = 2\,|\lambda| \left(\frac{n}{2}\right)^k D_{k,n}.$$

Plugging in $(-1)^k / n^{\underline{k}}$ for $\lambda$, we obtain

Corollary 4. The largest possible contrast in a k-out-of-n secret sharing scheme
satisfies

$$C_{k,n} = 2^{-(k-1)} \cdot \frac{n^k}{n^{\underline{k}}} \cdot D_{k,n}.$$

Since $D_{k,\infty} = \lim_{n \to \infty} D_{k,n} = 2^{-(k-1)}$, we get the following result:

Corollary 5. The limit of the largest possible contrast in a k-out-of-n secret
sharing scheme, when n approaches infinity, satisfies

$$C_{k,\infty} = \lim_{n \to \infty} C_{k,n} = 4^{-(k-1)}.$$
The derivation of $C_{k,\infty}$ from $D_{k,\infty}$ profited from the classical
Equation (6) from Approximation Theory. For n = k, we can go the other way and derive
$D_{k,k}$ from the fact (see [10]) that the largest possible contrast in a k-out-of-k
secret sharing scheme is

$$C_{k,k} = 2^{-(k-1)}. \qquad (8)$$

Applying Corollary 4, we obtain

$$D_{k,k} = \frac{k!}{k^k}. \qquad (9)$$

According to Stirling’s formula, this quantity is asymptotically equal to
$\sqrt{2\pi k}\, e^{-k}$. Equation (9) presents the precise value for the smallest
possible approximation error when $X^k$ is approximated by a polynomial of degree
k-1 or less, and the distance between polynomials is measured by metric $d_k$.
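The value $k!/k^k$ for the case n = k (the quantity whose Stirling approximation is $\sqrt{2\pi k}\,e^{-k}$) can be cross-checked without secret sharing schemes. The sketch below assumes a classical fact that the text does not spell out: when a function f is approximated on exactly k+1 points by polynomials of degree at most k-1, the smallest maximum error equals $|\sum_j w_j f(z_j)| / \sum_j |w_j|$ with $w_j = 1/\prod_{i \ne j}(z_j - z_i)$. The function name is hypothetical.

```python
from fractions import Fraction
from math import factorial


def discrete_error_on_k_plus_1_points(k):
    """Minimax error for X^k on z_j = -1 + 2j/k (j = 0..k) over degree <= k-1."""
    z = [Fraction(-1) + Fraction(2 * j, k) for j in range(k + 1)]
    w = []
    for j in range(k + 1):
        prod = Fraction(1)
        for i in range(k + 1):
            if i != j:
                prod *= z[j] - z[i]
        w.append(1 / prod)                       # divided-difference weight
    numerator = abs(sum(wj * zj ** k for wj, zj in zip(w, z)))
    return numerator / sum(abs(wj) for wj in w)
```

Exact rational arithmetic is used so that the comparison with $k!/k^k$ is an equality test rather than a floating-point approximation.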

Sequence $C_{k,n}$ monotonically decreases with n because the secret sharing
scheme becomes harder to design when more people are going to share the secret
(and threshold k is fixed). Thus, the unknown value for $C_{k,n}$ must be somewhere
between $C_{k,\infty} = 4^{-(k-1)}$ and $C_{k,k} = 2^{-(k-1)}$. We don’t expect the
sequence $D_{k,n}$ to be perfectly monotonic. However, we know that
$D_{k,n} \le D_{k,\infty}$. If n is a multiple of k, the regular subdivision of [-1, 1]
with step width 2/n is a refinement of the regular subdivision of [-1, 1] with step
width 2/k. This implies $D_{k,n} \ge D_{k,k}$.

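Figure 2 presents an overview of the results obtained so far. An edge from
a to b with label s should be interpreted as b = s · a. For instance, the edges
with labels $r_{k,n}$, $r'_{k,n}$, $s'_{k,n}$, $s_{k,n}$ represent the equations

$C_{k,n} = r_{k,n} \cdot C_{k,\infty}$ with $r_{k,n} \ge 1$,
$C_{k,k} = r'_{k,n} \cdot C_{k,n}$ with $r'_{k,n} \ge 1$,
$D_{k,n} = s'_{k,n} \cdot D_{k,k}$ with $s'_{k,n} \ge 1$ if n is a multiple of k,
$D_{k,\infty} = s_{k,n} \cdot D_{k,n}$ with $s_{k,n} \ge 1$,

respectively. The edges between $C_{k,n}$ and $D_{k,n}$ explain how $D_{k,n}$ is derived
from $C_{k,n}$ and vice versa, i.e., these edges represent Corollary 4. Figure 2 can be
used to obtain approximations for the unknown parameters $r_{k,n}$, $r'_{k,n}$,
$s'_{k,n}$, $s_{k,n}$. The simple path from $C_{k,\infty} = 4^{-(k-1)}$ to
$D_{k,\infty} = 2^{-(k-1)}$ corresponds to equation

$$2^{-(k-1)} = s_{k,n} \cdot 2^{k-1} \cdot \frac{n^{\underline{k}}}{n^k} \cdot r_{k,n} \cdot 4^{-(k-1)}.$$

Using $r_{k,n} \ge 1$, $s_{k,n} \ge 1$ and performing some cancellation, we arrive at

$$1 \le r_{k,n} \cdot s_{k,n} = \frac{n^k}{n^{\underline{k}}}. \qquad (10)$$

A similar computation associated with the simple path from $D_{k,k}$ to $C_{k,k}$ leads
to

$$r'_{k,n} \cdot s'_{k,n} = \frac{k^k}{k!} \cdot \frac{n^{\underline{k}}}{n^k}. \qquad (11)$$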
Fig. 2. Sequence $C_{k,n}$, sequence $D_{k,n}$ and relations between them.

The following bounds on $C_{k,n}$ and $D_{k,n}$ are now evident from Figure 2
and (10):

$$4^{-(k-1)} \le C_{k,n} \le 4^{-(k-1)} \cdot \frac{n^k}{n^{\underline{k}}}, \qquad (12)$$

$$\frac{n^{\underline{k}}}{n^k} \cdot 2^{-(k-1)} \le D_{k,n} \le 2^{-(k-1)}. \qquad (13)$$

In both cases, the upper bound exceeds the lower bound by factor
$n^k / n^{\underline{k}}$ only (approaching 1 when n approaches infinity). An elementary
computation shows that $1 - k^2/n \le n^{\underline{k}} / n^k \le 1$ holds for all
$1 \le k \le n$. Thus, (13) improves on the classical Inequality (7) from
Approximation Theory.
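The ratio of falling power to ordinary power that controls the gap between the lower and upper bounds is easy to tabulate. The sketch below computes it exactly and evaluates the bounds $4^{-(k-1)} \le C_{k,n} \le 4^{-(k-1)} \cdot n^k / n^{\underline{k}}$ as read from the surrounding text; the helper names are illustrative, not from the paper.

```python
from fractions import Fraction


def falling_power(n, k):
    """n * (n - 1) * ... * (n - (k - 1))."""
    result = 1
    for i in range(k):
        result *= n - i
    return result


def ratio(n, k):
    """Falling power of n divided by n^k, as an exact fraction."""
    return Fraction(falling_power(n, k), n ** k)


def contrast_bounds(n, k):
    """(lower, upper) bounds on the contrast: 4^{-(k-1)} and 4^{-(k-1)} / ratio(n, k)."""
    lower = Fraction(1, 4 ** (k - 1))
    return lower, lower / ratio(n, k)
```

For example, `contrast_bounds(4, 2)` gives the interval from 1/4 to 1/3, and the ratio tends to 1 as n grows with k fixed, which is why the two bounds meet in the limit.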

Although bounds (12) and (13) are excellent for large n, they are quite poor 
when n comes close to k. In this case however, we obtain from Figure 2 and (11) 

(r^2-(fc-i) < Ck,n < , (14) 

k\ n- , , 

< Dk,n < ^ , (15) 

where the first inequality in (15) is only guaranteed if n is a multiple of k. These 
bounds are tight for n = k. 

Because of (10), the two gaps cannot be maximal simultaneously. For instance, at
least one of the upper bounds exceeds the corresponding lower bound at most by
factor $\sqrt{n^k / n^{\underline{k}}}$.

The elementary computation makes use of $e^{-2x} \le 1 - x \le e^{-x}$, where the first
inequality holds for all $x \in [0, 1/2]$ and the second for all $x \in \mathbb{R}$.
3.3 Discussion

Paper [5] presented some explicit formulas for $C_{k,n}$, but only for small values
of k. For instance, it was shown that $C_{k,n}$ matches
$4^{-(k-1)} \cdot n^k / n^{\underline{k}}$ (which is the theoretical upper bound, see (12))
if k = 2 and n is even and if k = 3 and n is a multiple of 4.

However, computer experiments (computing $C_{k,n}$ by solving LP(k, n) and
comparing it with the theoretical upper bound from (12)) support the conjecture
that there is no such coincidence for most other choices of k, n. The goal of this
subsection is to provide a simple explanation for this phenomenon. Exhibiting
basic results from Approximation Theory concerning best approximating
polynomials on finite subsets of the real line (see, e.g., Theorem 1.7 and Theorem
1.11 from [13]), it is possible to derive the following

Theorem 2. It holds that $C_{k,n} = 4^{-(k-1)} \cdot n^k / n^{\underline{k}}$ iff
$E_k \subseteq Z_n$, where

$$E_k = \left\{ \cos\left(\frac{(k-i)\pi}{k}\right) \;\middle|\; i = 0, \ldots, k \right\},$$

$$Z_n = \{z_0, \ldots, z_n\} = \left\{ -1 + \frac{2i}{n} \;\middle|\; i = 0, \ldots, n \right\}.$$

Due to lack of space, for the proof of this result we refer to the journal version
of this paper. It is quite straightforward to derive that $E_2 \subseteq Z_n$ iff n is
even, that $E_3 \subseteq Z_n$ iff n is divisible by 4, and that
$E_k \not\subseteq Z_n$ for all n and $k \ge 4$ as $E_k$ contains irrational numbers.
Consequently,

Corollary 6. It holds that $C_{k,n} = 4^{-(k-1)} \cdot n^k / n^{\underline{k}}$ iff k = 2
and n is even or if k = 3 and n is a multiple of 4.
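The containment test behind Corollary 6 can be tried out numerically. The following sketch takes the set of extremal points of $T_k$ to be $\cos((k-i)\pi/k)$ for i = 0, ..., k and the grid to be $-1 + 2i/n$; it works in floating point with a tolerance, so it is a plausibility check for small k and n rather than a proof. The function names and the tolerance are assumptions made for the example.

```python
import math


def extremal_points(k):
    """The k + 1 points cos((k - i) * pi / k), i = 0..k."""
    return [math.cos((k - i) * math.pi / k) for i in range(k + 1)]


def on_grid(x, n, tol=1e-9):
    """True if x equals -1 + 2j/n for some integer j in 0..n (up to tol)."""
    j = (x + 1) * n / 2
    return abs(j - round(j)) < tol and -tol <= j <= n + tol


def attains_upper_bound(k, n):
    """True if every extremal point of T_k lies on the grid with step 2/n."""
    return all(on_grid(x, n) for x in extremal_points(k))
```

Scanning n up to 60 reproduces the pattern stated in the text: the condition holds for k = 2 exactly when n is even, for k = 3 exactly when n is divisible by 4, and never for k = 4, 5, 6.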



We conclude the paper with a final remark and an open problem. Based on
the results of this paper, Kuhlmann and Simon [6] were able to design arbitrary
k-out-of-n secret sharing schemes with asymptotically optimal contrast. More
precisely, the contrast achieved by their schemes is optimal up to a factor of at
most $1 - k^2/n$. For moderate values of k and n, these schemes are satisfactory. For
large values of n, they use too many subpixels. It is an open problem to determine
(as precisely as possible) the tradeoff between the contrast (which should be large)
and the number of subpixels (which should be small).
Abstract. We study the average-case behavior of algorithms for finding
a maximal disjoint subset of a given set of rectangles. In the probability
model, a random rectangle is the product of two independent random
intervals, each being the interval between two points drawn uniformly at
random from [0, 1]. We have proved that the expected cardinality of a
maximal disjoint subset of n random rectangles has the tight asymptotic
bound $\Theta(n^{1/2})$. Although tight bounds for the problem generalized to
d > 2 dimensions remain an open problem, we have been able to show
that $\Omega(n^{1/2})$ and $O((n \log^{d-1} n)^{1/2})$ are asymptotic lower and upper
bounds. In addition, we can prove that $\Theta(n^{d/(d+1)})$ is a tight asymptotic
bound for the case of random cubes.



1 Introduction 

We estimate the expected cardinality of a maximal disjoint subset of n rectangles 
chosen at random in the unit square. We say that such a subset is a packing 
of the n rectangles, and stress that a rectangle is specified by its position as 
well as its sides; it cannot be freely moved to any position such as in strip
packing or two-dimensional bin packing (see [2] and the references therein for
the probabilistic analysis of algorithms for these problems). A random rectangle
is the product of two independent random intervals on the coordinate axes; each 
random interval in turn is the interval between two independent random draws 
from a distribution G on [0, 1]. 

This problem is an immediate generalization of the one-dimensional problem
of packing random intervals [3]. It also generalizes in an obvious way to packing
random rectangles (boxes) in d > 2 dimensions into the d-dimensional unit cube,
where each such box is determined by 2d independent random draws from [0, 1],
two for every dimension. A later section also studies the case of random cubes in
$d \ge 2$ dimensions. For this case, to eliminate irritating boundary effects that do
not influence asymptotic behavior, we wrap around the dimensions of the unit
cube to form a toroid. In terms of an arbitrarily chosen origin, a random cube is
then determined by d + 1 random variables, the first d locating the vertex closest
to the origin, and the last giving the size of the cube, and hence the coordinates
of the remaining $2^d - 1$ vertices. Each random variable is again an independent
random draw from the distribution G.
Applications of our model appear in jointly scheduling multiple resources,
where customers require specific “intervals” of a resource or they require a re- 
source for specific intervals of time. An example of the former is a linear com- 
munication network and an example of the latter is a reservation system. In a 
linear network, we have a set S of call requests, each specifying a pair of end- 
points (calling parties) that define an interval of the network. If we suppose also 
that each request gives a future time interval to be reserved for the call, then a 
call request is a rectangle in the two dimensions of space and time. In an unnor- 
malized and perhaps discretized form, we can pose our problem of finding the 
expected value of the number of requests in S that can be accommodated. 

The complexity issue for the combinatorial version of our problem is eas- 
ily settled. Consider the two-dimensional case, and in particular a collection of 
equal size squares. In the associated intersection graph there is a vertex for each 
square and an edge between two vertices if and only if the corresponding squares 
overlap. Then our packing problem specialized to equal size squares becomes the 
problem of finding maximal independent sets in intersection graphs. It is easy to 
verify that this problem is NP-complete. For example, one can use the approach 
in [1] which was applied to equal size circles; the approach is equally applicable to 
equal size squares. We conclude that for any fixed $d \ge 2$, our problem of finding
maximal disjoint subsets of rectangles is NP-complete, even for the special case
of equal size cubes. As a final note, we point out that, in contrast to higher
dimensions, the one-dimensional (interval) problem has a polynomial-time solution
[3].
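To illustrate why the one-dimensional case is easy, here is a minimal sketch of the textbook greedy method for selecting a maximum number of pairwise disjoint intervals: scan the intervals in order of increasing right endpoint and keep each one that starts no earlier than the end of the last interval kept. This is the standard interval-scheduling algorithm and is not necessarily the procedure analyzed in [3]; intervals are assumed to be half-open pairs (left, right).

```python
def max_disjoint_intervals(intervals):
    """Greedy maximum set of pairwise disjoint half-open intervals [left, right)."""
    chosen = []
    last_end = float("-inf")
    for left, right in sorted(intervals, key=lambda iv: iv[1]):
        if left >= last_end:          # no overlap with the last interval kept
            chosen.append((left, right))
            last_end = right
    return chosen
```

Sorting dominates the cost, so the running time is O(n log n) for n intervals. No comparable ordering argument is available for rectangles, which is consistent with the hardness result quoted above.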

Let $S_n$ be a given set of random boxes, and let $C_n$ be the maximum
cardinality of any set of mutually disjoint boxes taken from $S_n$. After preliminaries
in the next section, Section 3 proves that, in the case of cubes in $d \ge 2$ dimensions,
$E[C_n] = \Theta(n^{d/(d+1)})$, and Section 4 proves that, in the case of boxes in d
dimensions, $E[C_n] = \Omega(n^{1/2})$ and $E[C_n] = O((n \log^{d-1} n)^{1/2})$. Section 5
contains our strongest result, which strengthens the above bounds for d = 2 by
presenting an $O(n^{1/2})$ tight upper bound. We sketch a proof that relies on a similar
result for a reduced, discretized version of the two dimensional problem.

2 Preliminaries 

We restrict the packing problem to continuous endpoint distributions G. Within 
this class, our results are independent of G, because the relevant intersection 
properties of G depend only on the relative ordering of the points that determine 
the intervals in each dimension. Thus, for simplicity, we assume hereafter that 
G is the uniform distribution on [0, 1]. 

It is also easily verified that we can Poissonize the problem without affecting
our results. In this version, the number of rectangles is a Poisson distributed
random variable with mean n, and we let C(n) denote the number packed
in a maximal disjoint subset. We will continue to parenthesize arguments in the
notation of the Poissonized model so as to distinguish quantities like $C_n$ in the
model where the number of rectangles to pack is fixed at n.
Let $X_1, \ldots, X_n$ be i.i.d. with a distribution F concentrated on [0, 1]. We
assume that F is regularly varying at 0 in that it is strictly increasing and that,
for some $\xi > 0$, some constants $K, K' \in (0, 1)$, and all $x \in (0, \xi)$, it satisfies
$F(x/2)/F(x) \in [K, K']$. For $(s_n \in (0, 1];\ n \ge 1)$ a given sequence, let
$N_n(F, s_n)$ be the maximum number of the $X_i$ that can be chosen such that their
sum is at most $n s_n$ on average. Equivalently, in terms of expected values, $N_n$ is
such that the sum of the smallest $N_n$ of the $X_i$ is bounded by $n s_n$, but the sum
of the smallest $N_n + 1$ of the $X_i$ exceeds $n s_n$.

Standard techniques along with a variant of Bernstein’s Theorem suffice to 
prove the following technical lemma. 

Lemma 1. With F and $(s_n, n \ge 1)$ as above, let $x_n$ be the solution to

$$s_n = \int_0^{x_n} x \, dF(x), \qquad (1)$$

and assume the $s_n$ are such that $\lim_{n \to \infty} x_n = 0$. Then if
$\lim_{n \to \infty} n F(x_n) = \infty$ and

nF(xn) = l7(log^s“^), we have 

$$E[N_n(F, s_n)] \sim n F(x_n). \qquad (2)$$



3 Random Cubes 

The optimum packing of random cubes is readily analyzed. We work with a
d-dimensional unit cube, and allow (toroidal) wrapping in all axes. The n cubes
are generated independently as follows: First a vertex $(v_1, v_2, \ldots, v_d)$ is
generated by drawing each $v_i$ independently from the uniform distribution on [0, 1].
Then one more value w is drawn independently, again uniformly from [0, 1]. The cube
generated is

$$[v_1, v_1 + w) \times [v_2, v_2 + w) \times \cdots \times [v_d, v_d + w),$$

where each coordinate is taken modulo 1. In this set-up, we have the following
result.

Theorem 1. The expected cardinality of a maximum packing of n random cubes
is $\Theta(n^{d/(d+1)})$.

Proof: For the lower bound consider the following simple heuristic. Subdivide
the cube into $c^{-d}$ cells with sides $c = \alpha n^{-1/(d+1)}$,
where $\alpha$ is a parameter that may be chosen to optimize constants. For each cell
C, if there are any generated cubes contained in C, include one of these in the
packing. Clearly, all of the cubes packed are nonoverlapping.
One can now show that the probability that a generated cube fits into a
particular cell C is $c^{d+1}/(d+1)$, and so the probability that C remains empty
after generating all n cubes is

$$\left(1 - \frac{c^{d+1}}{d+1}\right)^n = \left(1 - \frac{\alpha^{d+1}}{n(d+1)}\right)^n \sim \exp\left(-\frac{\alpha^{d+1}}{d+1}\right).$$

Since the number of cells is $1/c^d = n^{d/(d+1)} / \alpha^d$, the expectation of the total
number of cubes packed is asymptotically

$$\frac{n^{d/(d+1)}}{\alpha^d} \left(1 - \exp\left(-\frac{\alpha^{d+1}}{d+1}\right)\right),$$

which gives the desired lower bound.

The upper bound is based on the simple observation that the sum of the
volumes of the packed cubes is at most 1. First we consider the probability
distribution of the volume of a single generated cube. The side of this cube is a
uniform random variable U over [0, 1]. Thus the probability that its volume is
bounded by z is

$$F(z) = \Pr\{U^d \le z\} = \Pr\{U \le z^{1/d}\} = z^{1/d}.$$

Then applying Lemma 1 with $s_n = 1/n$, $x_n = ((d+1)/n)^{d/(d+1)}$, and
$F(x_n) = ((d+1)/n)^{1/(d+1)}$, we conclude that the expected number of cubes
selected before their total volume exceeds 1 is asymptotic to
$n F(x_n) = (d+1)^{1/(d+1)} n^{d/(d+1)}$,
which gives the desired matching upper bound. ∎
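The choice of $x_n$ in the upper-bound argument can be sanity-checked numerically. The sketch below uses $F(z) = z^{1/d}$, for which $\int_0^{x} t \, dF(t) = x^{(d+1)/d}/(d+1)$ (a one-line integration not written out in the text), and evaluates the three quantities that enter Lemma 1. The function name is hypothetical.

```python
import math


def cube_upper_bound(n, d):
    """Return (x_n, n * F(x_n), s_n) for cube volumes with F(z) = z^(1/d)."""
    x_n = ((d + 1) / n) ** (d / (d + 1))
    n_F = n * x_n ** (1 / d)                   # n * F(x_n)
    s_n = x_n ** ((d + 1) / d) / (d + 1)       # integral of t dF(t) over [0, x_n]
    return x_n, n_F, s_n
```

For every d the third component reproduces $s_n = 1/n$, and the second equals $(d+1)^{1/(d+1)} n^{d/(d+1)}$, the growth rate claimed in Theorem 1.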



4 Bounds for $d \ge 2$ Dimensional Boxes

Let $\mathcal{H}_d$ denote the unit hypercube in $d \ge 1$ dimensions. The approach of the
last section can also be used to prove asymptotic bounds for the case of random
boxes in $\mathcal{H}_d$.

Theorem 2. Fix d and draw n boxes independently and uniformly at random
from $\mathcal{H}_d$. The maximum number that can be packed is asymptotically bounded
from below by $\Omega(\sqrt{n})$ and from above by $O(\sqrt{n \ln^{d-1} n})$.

Proof sketch: The lower bound argument is the same as that for cubes, except
that $\mathcal{H}_d$ is partitioned into cells with sides on the order of $n^{-1/(2d)}$. It is
easy to verify that, on average, there is a constant fraction of the $n^{1/2}$ cells in
which each cell wholly contains at least one of the given rectangles.

To apply Lemma 1 in a proof of the upper bound, one first conducts an
asymptotic analysis of the distribution $F_d$, the volume of a d-dimensional box,
which shows that

$$\frac{dF_d(x)}{dx} \sim \frac{2^d}{(d-1)!} \ln^{d-1} x^{-1}.$$

Then, with $s_n = 1/n$, we obtain

$$x_n \sim \sqrt{(d-1)! / (n \ln^{d-1} n)} \quad\text{and}\quad F_d(x_n) \sim 2 \sqrt{\ln^{d-1} n / ((d-1)!\, n)},$$

which together with Lemma 1 yields the desired upper bound. ∎



5 Tight Bound for d = 2 

Closing the gaps left by the bounds on $E[C_n]$ for $d \ge 3$ remains an interesting
open problem. However, one can show that the lower bound for d = 2 is tight, i.e.,
$E[C_n] = \Theta(n^{1/2})$. To outline the proof of the $O(n^{1/2})$ bound, we first
introduce the following reduced, discretized version. A canonical interval is an
interval that, for some $i \ge 0$, has length $2^{-i}$ and has a left endpoint at some
multiple of $2^{-i}$. A canonical rectangle is the product of two canonical intervals.
In the reduced, rectangle-packing problem, a Poissonized model of canonical rectangles
is assumed in which the number of rectangles of area a is Poisson distributed with
mean $\lambda a^2$, independently for each possible a. Let $C^*(\lambda)$ denote the
cardinality of a maximum packing for an instance of the reduced problem with
parameter $\lambda$.

Note that there are i + 1 shapes possible for a rectangle of area $2^{-i}$, and that
for each of these shapes there are $2^i$ canonical rectangles. The mean number of
each of these is $\lambda / 2^{2i}$. Thus, the total number $T(\lambda)$ of rectangles in
the reduced problem with parameter $\lambda$ is Poisson distributed with mean

$$\sum_{i=0}^{\infty} (i+1)\, 2^i (\lambda 2^{-2i}) = \lambda \sum_{i=0}^{\infty} (i+1)\, 2^{-i} = 4\lambda. \qquad (3)$$

To convert an instance of the original problem to an instance of the reduced
problem, we proceed as follows. It can be seen that any interval in $\mathcal{H}_1$
contains either one or two canonical intervals of maximal length. Let the canonical
subinterval I' of an interval I be the maximal canonical interval in I, if only
one exists, and one such interval chosen randomly otherwise. A straightforward
analysis shows that a canonical subinterval $I' = [k 2^{-i}, (k+1) 2^{-i})$ has
probability 0 if it touches a boundary of $\mathcal{H}_1$, and has probability
$\frac{3}{2} 2^{-2i}$ otherwise. The canonical subrectangle R' of a rectangle R is
defined by applying the above separately to both coordinates. Extending the
calculations to rectangles, we get $\frac{9}{4} a^2$ as the probability of a canonical
subrectangle R' of area a, if R' does not touch the boundary of $\mathcal{H}_2$, and 0
otherwise. Now consider a random family of rectangles $\{R_i\}$ of which a maximum
of C(n) can be packed in $\mathcal{H}_2$. This family generates a random family of
canonical subrectangles $\{R'_i\}$. The maximum number C'(n) of the $R'_i$ that can be
packed trivially satisfies $C(n) \le C'(n)$. Since the number of each canonical
subrectangle of area a that does not touch a boundary is Poisson distributed with
mean $9na^2/4$, we see that an equivalent way to generate a random family $\{R'_i\}$
is simply to remove from a random instance of the reduced problem with parameter
9n/4 all those rectangles touching a boundary. It follows easily that
$EC(n) \le EC'(n) \le EC^*(9n/4)$, so if we can prove that $EC^*(9n/4)$, or
more simply $EC^*(n)$, has the $O(n^{1/2})$ upper bound, then we are done.
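The claim that an interval in the unit interval contains either one or two canonical intervals of maximal length can be explored with a small helper. This is an illustrative sketch with exact rational endpoints; the function name and the half-open convention [a, b) are choices made here, not taken from the paper.

```python
from fractions import Fraction


def maximal_canonical_intervals(a, b):
    """Return (i, ks): the longest canonical intervals [k/2^i, (k+1)/2^i) inside [a, b)."""
    a, b = Fraction(a), Fraction(b)
    if not 0 <= a < b <= 1:
        raise ValueError("need 0 <= a < b <= 1")
    i = 0
    while True:
        step = Fraction(1, 2 ** i)
        k = -((-a) // step)               # smallest k with k * step >= a
        ks = []
        while (k + 1) * step <= b:
            ks.append(int(k))
            k += 1
        if ks:
            return i, ks
        i += 1                            # halve the length and try again
```

For instance, [3/10, 8/10) contains only [1/2, 3/4) at the coarsest feasible level, whereas [1/5, 4/5) contains both [1/4, 1/2) and [1/2, 3/4). Three consecutive intervals of the same level can never all be maximal, because two of them would merge into a canonical interval of the next coarser level.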

The following observations bring out the key recursive structure of maximal
packings in the reduced problem. Let $Z_1$ be the maximum number of rectangles
that can be packed if we disallow packings that use rectangles spanning the
height of the square. Define $Z_2$ similarly when packings that use rectangles
spanning the width of the square are disallowed. By symmetry, $Z_1$ and $Z_2$ have the
same distribution, although they may not be independent. To find this distribution,
we begin by noting that (i) a rectangle spanning the width of $\mathcal{H}_2$ and one
spanning the height of $\mathcal{H}_2$ must intersect and hence cannot coexist in a
packing; (ii) rectangles spanning the height of $\mathcal{H}_2$ are the only rectangles
crossing the horizontal line separating the top and bottom halves of $\mathcal{H}_2$, and
rectangles spanning the width of $\mathcal{H}_2$ are the only ones crossing the vertical
line separating the left and right halves of $\mathcal{H}_2$. It follows that, if a
maximum cardinality packing is not just a single 1 x 1 square, then it consists of a
pair of disjoint maximum cardinality packings, one in the bottom half and one in the
top half of $\mathcal{H}_2$, or a similar pair of subpackings, one in the left half and
one in the right half of $\mathcal{H}_2$. After rescaling, these subpackings become
solutions to our original problem on $\mathcal{H}_2$ with the new parameter $\lambda$
times the square of half the area of $\mathcal{H}_2$, i.e., $\lambda/4$. We conclude
that $Z_1$ and $Z_2$ are distributed as the sum of two independent samples of
$C^*(\lambda/4)$, and that

$$C^*(\lambda) \le Z_0 + \max(Z_1, Z_2),$$

where $Z_0$ is the indicator function of the event that the entire square is one of
the given rectangles. Note that $Z_0$ is independent of $Z_1$ and $Z_2$.

To exploit the above recursion, it is convenient to work in terms of the
generating function, $S(\lambda) := E e^{\alpha C^*(\lambda)}$. One can show that
$S(\lambda) \le 2 e^{\alpha} (S(\lambda/4))^2$, and that a solution to this relation
along with the inequality
$E[C^*(\lambda)] \le \alpha^{-1} \ln E[e^{\alpha C^*(\lambda)}]$ yields the desired
bound, $E[C^*(\lambda)] = O(\lambda^{1/2})$.
Abstract. We consider digital trees such as (generalized) tries and
PATRICIA tries, built from n random strings generated by an unbiased
memoryless source (i.e., all symbols are equally likely). We study limit
laws of the height which is defined as the longest path in such trees. For
tries, in the region where most of the probability mass is concentrated,
the asymptotic distribution is of extreme value type (i.e., double
exponential distribution). Surprisingly enough, the height of the PATRICIA
trie behaves quite differently in this region: It exhibits an exponential
of a Gaussian distribution (with an oscillating term) around the most
probable value $k_1 = \lfloor \log_2 n + \sqrt{2 \log_2 n} - \frac{3}{2} \rfloor + 1$.
In fact, the asymptotic distribution of PATRICIA height concentrates on one or two
points. For most n all the mass is concentrated at $k_1$; however, there exist
subsequences of n such that the mass is on the two points $k_1 - 1$ and $k_1$,
or $k_1$ and $k_1 + 1$. We derive these results by a combination of analytic
methods such as generating functions, Mellin transform, the saddle point
method and ideas of applied mathematics such as linearization, asymptotic
matching and the WKB method.



1 Introduction 

Data structures and algorithms on words have experienced a new wave of inter- 
est due to a number of novel applications in computer science, communications, 
and biology. These include dynamic hashing, partial match retrieval of multi- 
dimensional data, searching and sorting, pattern matching, conflict resolution 
algorithms for broadcast communications, data compression, coding, security, 
gene searching, DNA sequencing, genome maps, IP-address lookup on the
internet, and so forth. To satisfy these diversified demands various data
structures were proposed for these algorithms. Undoubtedly, the most popular data
structures for algorithms on words are digital trees [9,12] (e.g., tries, PATRICIA
tries, digital search trees), and suffix trees [6,18].

The most basic digital tree is known as a trie (the name comes from retrieval). 
The primary purpose of a trie is to store a set S of strings (words, keys), say 

* The work was supported by NSF Grant DMS-93-00136 and DOE Grant DE-FG02- 
93ER25168, as well as by NSF Grants NCR-9415491, NCR-9804760. 
$S = \{X_1, \ldots, X_n\}$. Each word $X = x_1 x_2 x_3 \ldots$ is a finite or infinite
string of symbols taken from a finite alphabet. Throughout the paper, we deal only with
the binary alphabet {0, 1}, but all our results should be extendable to a general
finite alphabet. A string will be stored in a leaf of the trie. The trie over S is
built recursively as follows: For |S| = 0, the trie is, of course, empty. For |S| = 1,
trie(S) is a single node. If |S| > 1, S is split into two subsets $S_0$ and $S_1$ so that
a string is in $S_j$ if its first symbol is $j \in \{0, 1\}$. The tries $trie(S_0)$ and
$trie(S_1)$ are constructed in the same way except that at the k-th step, the splitting
of sets is based on the k-th symbol of the underlying strings.

There are many possible variations of the trie. One such variation is the b- 
trie, in which a leaf is allowed to hold as many as b strings (cf. [12,18]). A second 
variation of the trie, the PATRICIA trie eliminates the waste of space caused by 
nodes having only one branch. This is done by collapsing one-way branches into 
a single node. In a digital search tree (in short DST) strings are directly stored 
in nodes, and hence external nodes are eliminated. The branching policy is the 
same as in tries. The reader is referred to [6,9,12] for a detailed description of 
digital trees. Here, we consider tries and PATRICIA tries built over n randomly 
generated strings of binary symbols. We assume that every symbol is equally
likely, thus we are within the framework of the so-called unbiased memoryless
model. Our interest lies in establishing asymptotic distributions of the heights for
random b-tries, $\mathcal{H}_n^T$, and PATRICIA tries, $\mathcal{H}_n^P$. The height is the
longest path in such trees, and its distribution is of considerable interest for
several applications.
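The recursive definitions of the trie, the b-trie and the PATRICIA trie translate directly into height computations. The following is a minimal sketch under stated assumptions: keys are distinct Python strings over "0" and "1" of equal length, the height counts edges on the longest root-to-leaf path, and a leaf of a b-trie may hold up to b keys. The function names are illustrative.

```python
def trie_height(strings, depth=0, b=1):
    """Height of the b-trie built on distinct binary strings of equal length."""
    if len(strings) <= b:                 # the whole set fits in one leaf
        return 0
    zero = [s for s in strings if s[depth] == "0"]
    one = [s for s in strings if s[depth] == "1"]
    return 1 + max(trie_height(zero, depth + 1, b),
                   trie_height(one, depth + 1, b))


def patricia_height(strings, depth=0):
    """Height of the PATRICIA trie: one-way branches are collapsed."""
    if len(strings) <= 1:
        return 0
    zero = [s for s in strings if s[depth] == "0"]
    one = [s for s in strings if s[depth] == "1"]
    if not zero or not one:               # one-way branch: skip the symbol
        return patricia_height(strings, depth + 1)
    return 1 + max(patricia_height(zero, depth + 1),
                   patricia_height(one, depth + 1))
```

For the keys 000, 001 and 100 the ordinary trie has height 3, because 000 and 001 share a two-symbol prefix, while the PATRICIA trie has height 2, since the one-way branch on the second symbol is collapsed.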

We now summarize our main results. We obtain asymptotic expansions of the
distributions $\Pr\{\mathcal{H}_n^T \le k\}$ (b-tries) and $\Pr\{\mathcal{H}_n^P \le k\}$
(PATRICIA tries) for three ranges of n and k. For b-tries we consider: (i) the
“right-tail region” $k \to \infty$ and n = O(1); (ii) the “central region”
$n, k \to \infty$ with $\xi = n 2^{-k}$ and $0 < \xi < b$; and (iii) the “left-tail
region” $k, n \to \infty$ with $n - b 2^k = O(1)$. We prove that most
probability mass is concentrated in between the right tail and the central region.
In particular, for real x

$$\Pr\left\{\mathcal{H}_n^T \le \frac{1+b}{b} \log_2 n + x\right\} \sim \exp\left(-\frac{1}{(b+1)!}\, 2^{-bx + b \langle \frac{1+b}{b} \log_2 n + x \rangle}\right),$$

where $\langle r \rangle = r - \lfloor r \rfloor$ is the fractional part of r. In words, the
asymptotic distribution of b-tries height around its most likely value
$\frac{1+b}{b} \log_2 n$ resembles a double exponential (extreme value) distribution.
In fact, due to the oscillating term $\langle \frac{1+b}{b} \log_2 n + x \rangle$ the
limiting distribution does not exist, but one can find liminf and limsup of the
distribution.

The height of PATRICIA tries behaves differently in the central region (i.e.,
where most of the probability mass is concentrated). It is concentrated at or
near the most likely value
$k_1 = \lfloor \log_2 n + \sqrt{2 \log_2 n} - \frac{3}{2} \rfloor + 1$. We shall prove
that the asymptotic distribution around $k_1$ resembles an exponential of a Gaussian
distribution, with an oscillating term (cf. Theorem 3). In fact, there exist
subsequences of n such that the asymptotic distribution of PATRICIA height
concentrates only on $k_1$, or on $k_1$ and one of the two points $k_1 - 1$ or $k_1 + 1$.

The fractional part $\langle r \rangle$ is often denoted as {r}, but in order to avoid
confusion we adopt the above notation.

With respect to previous results, Devroye [1] and Pittel [14] established the
asymptotic distribution in the central regime for tries and b-tries, respectively,
using probabilistic tools. Jacquet and Regnier [7] obtained similar results by
analytic methods. The most probable value, $\log_2 n$, of the height for PATRICIA
was first proved by Pittel [13]. This was then improved to
$\log_2 n + \sqrt{2 \log_2 n}\,(1 + o(1))$ by Pittel and Rubin [15], and independently
by Devroye [2]. No results concerning the asymptotic distribution for PATRICIA
height were reported.

The full version of this paper with all proofs can be found on
http://www.cs.purdue.edu/people/spa.

2 Summary of Results 

As before, we let $\mathcal{H}_n^T$ and $\mathcal{H}_n^P$ denote, respectively, the height
of a b-trie and a PATRICIA trie. Their probability distributions are

$$\bar{h}_n^k = \Pr\{\mathcal{H}_n^T \le k\} \quad\text{and}\quad h_n^k = \Pr\{\mathcal{H}_n^P \le k\}. \qquad (1)$$

We note that for tries $\bar{h}_n^k = 0$ for $n > b 2^k$ (corresponding to a balanced
tree), while for PATRICIA tries $h_n^k = 0$ for $n > 2^k$. In addition, for PATRICIA we
have the following boundary condition: $h_n^k = 1$ for $k \ge n$. It asserts that the
height in a PATRICIA trie cannot be bigger than n (due to the elimination of
all one-way branches).

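The distribution of b-tries satisfies the recurrence relation 

    $\bar{h}_n^{k+1} = 2^{-n} \sum_{i=0}^{n} \binom{n}{i} \bar{h}_i^k \bar{h}_{n-i}^k$, $k \ge 0$    (2)

with the initial condition(s) 

    $\bar{h}_n^0 = 1$, $n = 0, 1, 2, \ldots, b$; and $\bar{h}_n^0 = 0$, $n > b$.    (3)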

This follows from $H_n^T = \max\{H_i^{LT}, H_{n-i}^{RT}\} + 1$, where $LT$ and $RT$ denote, 
respectively, the left subtree and the right subtree of sizes i and n - i, which 
happens with probability $2^{-n}\binom{n}{i}$. Similarly, for PATRICIA tries we have 

    $h_n^{k+1} = 2^{-n+1} h_n^{k+1} + 2^{-n} \sum_{i=1}^{n-1} \binom{n}{i} h_i^k h_{n-i}^k$, $k \ge 0$    (4)

with the initial conditions 

    $h_0^0 = h_1^0 = 1$ and $h_n^0 = 0$, $n \ge 2$.    (5)

Unlike b-tries, in a PATRICIA trie the left and the right subtrees cannot be 
empty (which occurs with probability $2^{-n+1}$). 
We shall analyze these problems asymptotically, in the limit $n \to \infty$. Despite 
the similarity between (2) and (4), we will show that even asymptotically the 
two distributions behave very differently. 
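Before turning to asymptotics, both recurrences can be iterated exactly for small n and k. The following is only a minimal sketch: it assumes recurrence (2) with initial condition (3) for b-tries, and recurrence (4) with initial condition (5) for PATRICIA tries, where (4) is solved for the term $h_n^{k+1}$ that appears on both sides. The function names are illustrative, and setting the PATRICIA values for n = 0, 1 to 1 at every level is an assumption made here (an empty or single-key trie has height 0).

```python
from fractions import Fraction
from math import comb, factorial

def trie_height_cdf(n_max, k_max, b=1):
    """table[k][n] = Pr{height of a b-trie on n keys <= k}, iterating (2)-(3)."""
    table = [[Fraction(int(n <= b)) for n in range(n_max + 1)]]
    for _ in range(k_max):
        prev = table[-1]
        table.append([
            sum(comb(n, i) * prev[i] * prev[n - i] for i in range(n + 1)) / 2**n
            for n in range(n_max + 1)
        ])
    return table

def patricia_height_cdf(n_max, k_max):
    """table[k][n] = Pr{height of a PATRICIA trie on n keys <= k}, iterating (4)-(5)."""
    table = [[Fraction(int(n <= 1)) for n in range(n_max + 1)]]
    for _ in range(k_max):
        prev = table[-1]
        row = []
        for n in range(n_max + 1):
            if n <= 1:
                row.append(Fraction(1))  # assumption: 0 or 1 keys always give height 0
            else:
                inner = sum(comb(n, i) * prev[i] * prev[n - i] for i in range(1, n))
                # (4) contains h_n^{k+1} on both sides; solving gives inner / (2^n - 2)
                row.append(Fraction(inner) / (2**n - 2))
        table.append(row)
    return table
```

For b = 1 the first table can be compared entry by entry with the closed form $(2^k)!/(2^{nk}(2^k-n)!)$, and the second table reproduces the boundary conditions stated above for PATRICIA tries.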

We first consider ordinary tries (i.e., 1-tries). It is relatively easy to solve (2) 
and (3) explicitly and obtain the integral representation 

    $\bar{h}_n^k = \frac{n!}{2\pi i} \oint (1 + z 2^{-k})^{2^k} z^{-n-1}\, dz = \begin{cases} 0, & n > 2^k, \\ \dfrac{(2^k)!}{2^{nk}\,(2^k - n)!}, & 0 \le n \le 2^k. \end{cases}$    (6)

Here the loop integral is for any closed circle surrounding z = 0. 

Using asymptotic methods for evaluating integrals, or applying Stirling’s for- 
mula to the second part of (6), we obtain the following. 



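Theorem 1. The distribution of the height of tries has the following asymptotic 
expansions: 

(i) Right-Tail Region: $k \to \infty$, $n = O(1)$ 

    $\Pr\{H_n^T \le k\} = \bar{h}_n^k = 1 - n(n-1)2^{-k-1} + O(2^{-2k})$. 

(ii) Central Region: $k, n \to \infty$ with $\xi = n2^{-k}$, $0 < \xi < 1$ 

    $\bar{h}_n^k \sim A(\xi) e^{n\phi(\xi)}$, 

where 

    $\phi(\xi) = \left(1 - \frac{1}{\xi}\right) \log(1 - \xi) - 1$, 

    $A(\xi) = (1 - \xi)^{-1/2}$. 

(iii) Left-Tail Region: $k, n \to \infty$ with $2^k - n = j = O(1)$ 

    $\bar{h}_n^k \sim \sqrt{2\pi n}\, \frac{n^j}{j!}\, e^{-n-j}$. 

This shows that there are three ranges of k and n where the asymptotic form of 
$\bar{h}_n^k$ is different. 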
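A quick numerical comparison illustrates the right-tail and central expansions. This is a sketch only: it assumes the closed form in the second part of (6), the right-tail expression $1 - n(n-1)2^{-k-1}$, and the central-region functions $\phi(\xi) = (1 - 1/\xi)\log(1-\xi) - 1$ and $A(\xi) = (1-\xi)^{-1/2}$. The function names are illustrative.

```python
from math import exp, lgamma, log, sqrt

def exact_trie_cdf(n, k):
    """Closed form (2^k)! / (2^{nk} (2^k - n)!) for 1-tries, evaluated via log-gamma."""
    big_n = 2**k
    if n > big_n:
        return 0.0
    return exp(lgamma(big_n + 1) - lgamma(big_n - n + 1) - n * k * log(2))

def right_tail_approx(n, k):
    """Leading terms of the right-tail expansion (k large, n fixed)."""
    return 1 - n * (n - 1) * 2.0 ** (-k - 1)

def central_approx(n, k):
    """A(xi) * exp(n * phi(xi)) with xi = n 2^{-k}, valid for 0 < xi < 1."""
    xi = n / 2**k
    phi = (1 - 1 / xi) * log(1 - xi) - 1
    return exp(n * phi) / sqrt(1 - xi)
```

For instance, with n = 512 and k = 10 (so that $\xi = 1/2$) the ratio `exact_trie_cdf(512, 10) / central_approx(512, 10)` is very close to 1, as expected from Stirling's formula.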



We next consider the “asymptotic matching” (see [11]) between the three 
expansions. If we expand (i) for n large, we obtain $1 - \bar{h}_n^k \sim n^2 2^{-k-1}$. For $\xi \to 0$ 
we have $A(\xi) \sim 1$ and $\phi(\xi) \sim -\xi/2$ so that the result in (ii) becomes 

    $A(\xi) e^{n\phi(\xi)} \sim e^{-n\xi/2} = \exp\left(-\tfrac{1}{2} n^2 2^{-k}\right) \sim 1 - \tfrac{1}{2} n^2 2^{-k}$,    (7)

where the last approximation assumes that $n, k \to \infty$ in such a way that 
$n^2 2^{-k} \to 0$. Since (7) agrees precisely with the expansion of (i) as $n \to \infty$, 
we say that (i) and (ii) asymptotically match. To be precise, we say they match 
to leading order; higher order matchings can be verified by computing higher 
order terms in the asymptotic series in (i) and (ii). We can easily show that the 
expansion of (ii) as $\xi \to 1^-$ agrees with the expansion of (iii) as $j \to \infty$, so that 
(ii) and (iii) also asymptotically match. The matching verifications imply that, 
at least to leading order, there are no “gaps” in the asymptotics. In other words, 
one of the results in (i)-(iii) applies for any asymptotic limit which has k and/or 
n large. We recall that $\bar{h}_n^k = 0$ for $n > 2^k$ so we need only consider $k \ge \log_2 n$. 

The asymptotic limits where (i)-(iii) apply are the three “natural scales” for 
this problem. We can certainly consider other limits (such as $k, n \to \infty$ with 
k/n fixed), but the expansions that apply in these limits would necessarily be 
limiting cases of one of the three results in Theorem 1. In particular, if we let 
$k, n \to \infty$ with $k - 2\log_2 n = O(1)$, we are led to 

    $\bar{h}_n^k \sim \exp\left(-\tfrac{1}{2} n^2 2^{-k}\right) = \exp\left(-\tfrac{1}{2} \exp(-k \log 2 + 2 \log n)\right)$.    (8)

This result is well-known (see [1,7]) and corresponds to a limiting double 
exponential (or extreme value) distribution. However, according to our discussion, 
$k = 2\log_2 n + O(1)$ is not a natural scale for this problem. The scale 
$k = \log_2 n + O(1)$ (where (ii) applies) is a natural scale, and the result in (8) 
may be obtained as a limiting case of (ii), by expanding (ii) for $\xi \to 0$. 

We next generalize Theorem 1 to arbitrary b, and obtain the following result 
whose proof can be found in our full paper available on 
http://www.cs.purdue.edu/people/spa. 



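Theorem 2. The distribution of the height of b-tries has the following asymptotic 
expansions for fixed b: 

(i) Right-Tail Region: $k \to \infty$, $n = O(1)$: 

    $\Pr\{H_n^T \le k\} = \bar{h}_n^k \sim 1 - \frac{n!}{(b+1)!\,(n-b-1)!}\, 2^{-kb}$. 

(ii) Central Regime: $k, n \to \infty$ with $\xi = n2^{-k}$, $0 < \xi < b$: 

    $\bar{h}_n^k \sim A(\xi; b)\, e^{n\phi(\xi; b)}$, 

where 

    $\phi(\xi; b) = -1 - \log \omega_0 + \frac{1}{\xi}\left(b \log(\omega_0 \xi) - \log b! - \log\left(1 - \frac{1}{\omega_0}\right)\right)$, 

    $A(\xi; b) = \frac{1}{\sqrt{1 + (\omega_0 - 1)(\xi - b)}}$. 

In the above, $\omega_0 = \omega_0(\xi; b)$ is the solution to 

    $1 - \frac{1}{\omega_0} = \frac{(\omega_0 \xi)^b}{b!\left(1 + \omega_0\xi + \frac{\omega_0^2\xi^2}{2!} + \cdots + \frac{\omega_0^b\xi^b}{b!}\right)}$. 

(iii) Left-Tail Region: $k, n \to \infty$ with $j = b2^k - n$ 

    $\bar{h}_n^k \sim \sqrt{2\pi n}\, \frac{n^j}{j!}\, b^n \exp\left(-(n + j)\left(1 + b^{-1} \log b!\right)\right)$, 

where $j = O(1)$. 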



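When b = 1 we can easily show that Theorem 2 reduces to Theorem 1 since 
in this case $\omega_0(\xi; 1) = 1/(1 - \xi)$. We also can obtain $\omega_0$ explicitly for b = 2, 
namely: 

    $\omega_0(\xi; 2) = \frac{2}{1 - \xi + \sqrt{1 + 2\xi - \xi^2}}$.    (9)

For arbitrary b, we have $\omega_0 \to \infty$ as $\xi \to b^-$ and $\omega_0 \to 1$ as $\xi \to 0^+$. More 
precisely, 

    $\omega_0 = 1 + \frac{\xi^b}{b!} + O(\xi^{b+1})$, $\xi \to 0$,    (10)

    $\omega_0 = \frac{1}{b - \xi} + \frac{b-1}{b} + O(b - \xi)$, $\xi \to b$.    (11)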
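The quantity $\omega_0$ is easy to evaluate numerically. The sketch below assumes that $\omega_0(\xi; b)$ is the root larger than 1 of the saddle-point equation $1 - 1/\omega = (\omega\xi)^b / (b!\, e_b(\omega\xi))$, where $e_b(x) = \sum_{j=0}^{b} x^j/j!$; this form is a reconstruction and should be checked against the full paper. It uses plain bisection, and the closed form for b = 2 is the one obtained by solving the resulting quadratic. All names are illustrative.

```python
from math import factorial, sqrt

def omega0(xi, b, tol=1e-13):
    """Root omega > 1 of 1 - 1/omega = (omega*xi)^b / (b! * e_b(omega*xi)), for 0 < xi < b."""
    def g(w):
        x = w * xi
        e_b = sum(x**j / factorial(j) for j in range(b + 1))
        return (1 - 1 / w) - x**b / (factorial(b) * e_b)

    lo, hi = 1.0, 2.0          # g(1) < 0 because the right-hand side is positive
    while g(hi) < 0:           # g becomes positive for large omega when xi < b
        hi *= 2
    while hi - lo > tol * hi:  # bisection on the sign change
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def omega0_b2(xi):
    """Explicit root for b = 2 (positive root of the quadratic in omega)."""
    return 2 / (1 - xi + sqrt(1 + 2 * xi - xi * xi))
```

For b = 1 the numerical root agrees with $1/(1-\xi)$, and for small $\xi$ it approaches $1 + \xi^b/b!$.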

Using (10) and (11) we can also show that the three parts of Theorem 2 
asymptotically match. In particular, by expanding part (ii) as $\xi \to 0$ we obtain 

    $\Pr\{H_n^T \le k\} \sim A(\xi) e^{n\phi(\xi)} \sim \exp\left(-\frac{n\xi^b}{(b+1)!}\right) = \exp\left(-\frac{n^{1+b}\, 2^{-kb}}{(b+1)!}\right)$.    (12)

This yields the well-known (see [7,14]) asymptotic distribution of b-tries. We note 
that, for $k, n \to \infty$, (12) is O(1) for $k - (1 + 1/b)\log_2 n = O(1)$. More precisely, 
let us estimate the probability mass of $H_n^T$ around $(1 + 1/b)\log_2 n + x$ where x 
is a real fixed value. We observe from (12) that 

    $\Pr\{H_n^T \le (1 + 1/b)\log_2 n + x\} = \Pr\{H_n^T \le \lfloor (1 + 1/b)\log_2 n + x \rfloor\}$ 
    $\sim \exp\left(-\frac{1}{(b+1)!}\, 2^{-b\left(x - \langle (1 + 1/b)\log_2 n + x \rangle\right)}\right)$,    (13)

where $\langle x \rangle$ is the fractional part of x, that is, $\langle x \rangle = x - \lfloor x \rfloor$. 



Corollary 1. While the limiting distribution of the height for b-tries does not 
exist, the following lower and upper envelopes can be established 

    $\liminf_{n \to \infty} \Pr\{H_n^T \le (1 + 1/b)\log_2 n + x\} = \exp\left(-\frac{2^{-b(x-1)}}{(1+b)!}\right)$, 

    $\limsup_{n \to \infty} \Pr\{H_n^T \le (1 + 1/b)\log_2 n + x\} = \exp\left(-\frac{2^{-bx}}{(1+b)!}\right)$ 

for fixed real x. 
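The oscillation behind Corollary 1 can be made concrete. The sketch below assumes the leading-order form obtained by substituting $k = \lfloor (1+1/b)\log_2 n + x \rfloor$ into (12), namely $\exp(-2^{-b(x - \langle (1+1/b)\log_2 n + x\rangle)}/(b+1)!)$, and the two envelopes $\exp(-2^{-b(x-1)}/(b+1)!)$ and $\exp(-2^{-bx}/(b+1)!)$. The function names are illustrative.

```python
from math import exp, factorial, floor, log2

def btrie_height_cdf_approx(n, x, b=1):
    """Leading-order value of Pr{H <= (1+1/b) log2 n + x} for a b-trie on n keys."""
    y = (1 + 1 / b) * log2(n) + x
    frac = y - floor(y)                      # fractional part <y>
    return exp(-2.0 ** (-b * (x - frac)) / factorial(b + 1))

def envelopes(x, b=1):
    """(lower, upper) envelopes: fractional part tending to 1 and equal to 0."""
    lower = exp(-2.0 ** (-b * (x - 1)) / factorial(b + 1))
    upper = exp(-2.0 ** (-b * x) / factorial(b + 1))
    return lower, upper
```

Evaluating `btrie_height_cdf_approx(n, x, b)` over a range of n shows the value wandering between the two envelopes without settling, which is why no limiting distribution exists.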
We next turn our attention to PATRICIA tries. Using ideas of applied 
mathematics, such as linearization and asymptotic matching, we obtain the following. 
The derivation can be found on http://www.cs.purdue.edu/people/spa where 
we make certain assumptions about the forms of the asymptotic expansions, as 
well as the asymptotic matching between the various scales. 



Theorem 3. The distribution of PATRICIA tries has the following asymptotic 
expansions: 

(i) Right-Tail Regime: $k, n \to \infty$ with $n - k = j = O(1)$, $j \ge 2$ 

    $\Pr\{H_n^P \le n - j\} = h_n^{n-j} \sim 1 - \rho_0 K_j \cdot n! \cdot 2^{-n^2/2 + (j - 3/2)n}$, 



where 



Kj = 



J 



Cl = ^ 

^ 2m 






n 

m—O 



1 - exp(-t2-™-i) 



t2 



— m — 1 



dz 



(14) 

(15) 



and $\rho_0 = \prod_{\ell=2}^{\infty} (1 - 2^{-\ell})^{-1} = 1.73137\ldots$ 

(ii) Central Regime: $k, n \to \infty$ with $\xi = n2^{-k}$, $0 < \xi < 1$ 

h’f ~ Vl + 2f^’{0 + 



We know $\Phi(\xi)$ analytically only for $\xi \to 0$ and $\xi \to 1$. In particular, for $\xi \to 0$ 






C^o+, 



(16) 



with 



t{x) = -k 1) 4- ^ log ( 

e=o ^ 



1 - exp(-2"^-^) 



^ +^log(l-exp(-2"+^)) 

2 e=i 

( 17 ) 

n^) e"“‘. ( 18 ) 

2m£ V log2;H log2; ^ ^ 






1 f J 



-U — 

12 ' log2 V 2 12 



£ = -o 



In the above, $\Gamma(\cdot)$ is the Gamma function, $\zeta(\cdot)$ is the Riemann zeta function, 
$\gamma = -\Gamma'(1)$ is the Euler constant, and $\gamma(1)$ is defined by the Laurent series 
$\zeta(s) = 1/(s-1) + \gamma - \gamma(1)(s-1) + O((s-1)^2)$. The function $\Psi(x)$ is periodic with 
a very small amplitude, i.e., |!?'(a;)| < 10“®. Moreover, for ■J 1 Ike function 
becomes 



    $\Phi(\xi) \sim D_1 + (1 - \xi)\log(1 - \xi) - (1 - \xi)(1 + \log D_2)$, 

where $D_1 = 1 + \log(K_0^*)$ and $D_2 = K_0^* K_1^*/e$ with 



K* 



Kl 



=.68321974..., 

00 00 , 

£—1 m —1 



1 - 2 



- 2 ' 



'‘+1 



1.2596283... 



(iii) Left-Tail Regime: $k, n \to \infty$ with $2^k - n = M = O(1)$ 



hi ^ 

” M\ ^ 

where D\ and D 2 are defined above. 

The expressions for $h_n^k$ in parts (i) and (iii) are completely determined. 
However, the expression in part (ii) involves the function $\Phi(\xi)$. We have not been able 
to determine this function analytically, except for its behaviors as $\xi$ approaches 
0 or 1. The behavior of $\Phi(\xi)$ as $\xi \to 1^-$ implies the asymptotic matching of 
parts (ii) and (iii), while the behavior as $\xi \to 0^+$ implies the matching of (i) and 
(ii). As $\xi \to 0$, this behavior involves the periodic function $\psi(x)$, which satisfies 
$\psi(x + 1) = \psi(x)$. In part (ii) we give two different representations for $\psi(x)$; the 
latter (which involves $\Psi(x)$) is a Fourier series. 

Since $\Phi(\xi) > 0$, we see that in (ii) and (iii), the distribution is exponentially 
small in n, while in (i), $1 - h_n^k$ is super-exponentially small (the dominant term 
in $1 - h_n^k$ is $2^{-n^2/2}$). Thus, (i) applies in the right tail of the distribution while 
(ii) and (iii) apply in the left tail. We wish to compute the range of k where $h_n^k$ 
undergoes the transition from $\approx 0$ to $\approx 1$, as $n \to \infty$. This must be in the 
asymptotic matching region between (i) and (ii). We can show that $C_j$, defined 
in Theorem 3(i), becomes as $j \to \infty$ 



Ci 



i'5/2 

I f,v(a) 



exp -X 



llog'j 



2 log 2 



(19) 



where $\alpha = \langle \log_2 j \rangle$. With (19), we can verify the matching between parts (i) and 
(ii), and the limiting form of (ii) as $\xi \to 0^+$ is 



hi ~ exp 






+ o “ n - 2 log 2 n - - 



= exp "^2^/®nexp (fc -I- 1.5 — log 2 n)^^ ^ 



= exp I —po • n • exp 



(fc -f 1.5 - log 2 nf + 9 + <F(log 2 n) 



(20) 

( 21 ) 



where $\rho_0$ is defined in Theorem 3(i) and 



9 = 



1 



log2 V 2 



5T + 7(1) - TR + 



12 



log 2 
24 



= -1.022401... 
while $|\Psi(\log_2 n)|$ is negligibly small, since $\Psi$ has a very small amplitude. We have 
written (20) in terms of k and n, recalling that $\xi = n2^{-k}$. We also have used the fact 
that the square-root prefactor in part (ii) tends to 1 as $\xi \to 0$. 

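We now set, for an integer $\ell$, 

    $k_\ell = \lfloor \log_2 n + \sqrt{2\log_2 n} - \frac{3}{2} \rfloor + \ell$    (22)
    $\phantom{k_\ell} = \log_2 n + \sqrt{2\log_2 n} - \frac{3}{2} + \ell - \beta_n$, 

where 

    $\beta_n = \langle \log_2 n + \sqrt{2\log_2 n} - \frac{3}{2} \rangle \in [0, 1)$.    (23)

In terms of $\ell$ and $\beta_n$, (21) becomes 

    $\Pr\{H_n^P \le \lfloor \log_2 n + \sqrt{2\log_2 n} - 1.5 \rfloor + \ell\}$    (24)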

^ exp ^ — log2 ^ 

For $0 < \beta_n < 1$ and $n \to \infty$ the above is small for $\ell \le 0$, and it is close to one 
for $\ell \ge 1$. This shows that asymptotically, as $n \to \infty$, all the mass accumulates 
when $k = k_1$ given by (22) with $\ell = 1$. Now suppose $\beta_n = 0$ for some n, or 
more generally that we can find a sequence $n_i$ such that $n_i \to \infty$ as $i \to \infty$ but 
$\sqrt{2\log_2 n_i}\, \langle \log_2 n_i + \sqrt{2\log_2 n_i} - \frac{3}{2} \rangle$ remains bounded. Then, the expression in 
(24) would be O(1) for $\ell = 0$ (since $\beta_n \sqrt{2\log_2 n} = O(1)$). For $\ell = 1$, (24) would 
then be asymptotically close to 1. Thus, now the mass would accumulate at 
two points, namely, $k_0 = k_1 - 1$ and $k_1$. Finally, if $\beta_n = 1 - o(1)$ such that 
$(1 - \beta_n)\sqrt{2\log_2 n} = O(1)$, then the probability mass is concentrated on $k_1$ and 
$k_1 + 1$. 
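The three cases are governed entirely by $\beta_n$ and by the product $R(n) = \beta_n\sqrt{2\log_2 n}$. A minimal sketch, assuming the definitions $k_1 = \lfloor \log_2 n + \sqrt{2\log_2 n} - 3/2 \rfloor + 1$ and $\beta_n = \langle \log_2 n + \sqrt{2\log_2 n} - 3/2 \rangle$ (the function name is illustrative):

```python
from math import floor, log2, sqrt

def patricia_mode(n):
    """Return (k_1, beta_n, R(n)) for a PATRICIA trie built on n keys."""
    scale = sqrt(2 * log2(n))
    y = log2(n) + scale - 1.5
    beta = y - floor(y)              # fractional part of y
    return floor(y) + 1, beta, beta * scale
```

For n = 256 this gives $k_1 = 11$, $\beta_n = 0.5$ and $R(n) = 2$; values of n with small R(n), or with R(n) close to $\sqrt{2\log_2 n}$, are the ones where a second point carries non-negligible mass.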

In order to verify the latter assertions, we must either show that $\beta_n = 0$ for 
an integer n or that there is a subsequence $n_i$ such that $\sqrt{2\log_2 n_i}\, \beta_{n_i} = O(1)$. 
The former is false, while the latter is true. To prove that $\beta_n = 0$ is impossible 
for integer n, let us assume the contrary. If there exists an integer N such that 

    $\log_2 n + \sqrt{2\log_2 n} - \frac{3}{2} = N$, 

then 

    $n = 2^{N + 5/2 - \sqrt{4 + 2N}}$. 

But this is impossible since this would require that 4 + 2N is odd. To see that 
there exists a subsequence such that $R(n_i) = \beta_{n_i}\sqrt{2\log_2 n_i} = O(1)$, we observe 
that the function R(n) fluctuates from zero to $\sqrt{2\log_2 n}$. We can show that if 
$n_i = \lfloor 2^{i + 5/2 - \sqrt{4 + 2i}} \rfloor + 1$, then $R(n_i) \to 0$ as $i \to \infty$. Note that this subsequence 
corresponds to the minima of R(n). 

Corollary 2. The asymptotic distribution of PATRICIA height is concentrated 
among the three points $k_1 - 1$, $k_1$ and $k_1 + 1$ where $k_1 = \lfloor \log_2 n + \sqrt{2\log_2 n} - \frac{3}{2} \rfloor + 1$, 
that is, 

    $\Pr\{H_n^P = k_1 - 1 \text{ or } k_1 \text{ or } k_1 + 1\} = 1 - o(1)$ 

as $n \to \infty$. More precisely: (i) there are subsequences $n_i$ for which 
$\Pr\{H_{n_i}^P = k_1\} = 1 - o(1)$ provided that $R(n_i) \to \infty$ and $\sqrt{2\log_2 n_i} - R(n_i) \to \infty$ 
as $i \to \infty$; (ii) there are subsequences $n_i$ for which $\Pr\{H_{n_i}^P = k_1 - 1 \text{ or } k_1\} = 
1 - o(1)$ provided that $R(n_i) = O(1)$; (iii) finally, there are subsequences $n_i$ for 
which $\Pr\{H_{n_i}^P = k_1 \text{ or } k_1 + 1\} = 1 - o(1)$ provided that $\sqrt{2\log_2 n_i} - R(n_i) = O(1)$. 
Abstract. In this paper we show that the routing permutation problem 
is NP-hard even for binary trees. Moreover, we show that in the case of 
unbounded degree tree networks, the routing permutation problem is 
NP-hard even if the permutations to be routed are involutions. Finally, 
we show that the average-case complexity of the routing permutation 
problem on linear networks is n/4 + o(n). 
1 Introduction 

Efficient communication is a prerequisite to exploit the performance of large 
parallel systems. The routing problem on communication networks consists in 
the efficient allocation of resources to connection requests. In this network, es- 
tablishing a connection between two nodes requires selecting a path connecting 
the two nodes and allocating sufficient resources on all links along the paths as- 
sociated to the collection of requests. In the case of all-optical networks, data is 
transmitted on lightwaves through optical fiber, and several signals can be trans- 
mitted through a fiber link simultaneously provided that different wavelengths 
are used in order to prevent interference (wavelength-division multiplexing) [4]. 
As the number of wavelengths is a limited resource, it is desirable to establish 
a given set of connection requests with a minimum number of wavelengths. 
In this context, it is natural to think of wavelengths as colors. Thus the routing 
problem for all-optical networks can be viewed as a path coloring problem: it 
consists in finding a desirable collection of paths on the network associated with 
the collection of connection requests in order to minimize the number of colors 
needed to color these paths in such a way that any two different paths sharing 
the same link of the network are assigned different colors. For simple networks, 
such as trees, the routing problem is simpler, as there is always a unique path 
for each communication request. 

This paper is concerned with routing permutations on trees by arc-disjoint paths, 
that is, the path coloring problem on trees when the collection of connection 
requests represents a permutation of the nodes of the tree network. 

Previous and related work. In [1], Aumann and Rabani have shown that 
$O(\log^2 n / \beta^2)$ colors suffice for routing any permutation on any bounded degree 
network on n nodes, where $\beta$ is the arc expansion of the network. The result of 
Aumann and Rabani almost matches the existential lower bound of $\Omega(1/\beta^2)$ 
obtained by Raghavan and Upfal [18]. In the case of specific network topologies, 
Gu and Tamaki [13] proved that 2 colors are sufficient to route any permutation 
on any symmetric directed hypercube. Independently, Paterson et al. [17] and 
Wilfong and Winkler [22] have shown that the routing permutation problem 
on ring networks is NP-hard. Moreover, in [22] a 2-approximation algorithm is 
given for this problem on ring networks. To our knowledge, the routing 
permutation problem on tree networks by arc-disjoint paths has not been studied in 
the literature. 

Our results. In Section 2 we first give some definitions and recall previous 
results. In Section 3 we show that for arbitrary permutations, the routing per- 
mutation problem is NP-hard even for binary trees. Moreover, we show that the 
routing permutations problem on unbounded degree trees is NP-hard even if 
the permutations to be routed are involutions, i.e. permutations with cycles of 
length at most two. In Section 4 we focus on linear networks. In this particular 
case, since the problem reduces to coloring an interval graph, the routing of any 
permutation is easily done in polynomial time [14]. We show that the average 
number of colors needed to color any permutation on a linear network on n 
vertices is n/4 + o(n). As far as we know, this is the first result on the average-case 
complexity for routing permutations on networks by arc-disjoint paths. Finally, 
in Section 5 we give some open problems and future work. 

2 Definitions and Preliminary Results 

We model the tree network as a rooted labeled symmetric directed tree T = 
(V,A), where processors and switches are vertices and links are modeled by 
two arcs in opposite directions. In the sequel, we assume that the labels of the 
vertices of a tree T on n vertices are {1,2,...,n} and are such that a postfix tree 
traversal would be exactly 1,2,...,n. This implies that for any internal vertex 
labeled by i the labels of the vertices in its subtree are less than i. Given two 
vertices i and j of the tree T, we denote by <i,j> the unique path from vertex 
i to vertex j. The arc from vertex i to its father (resp. from the father of i to i) 
($1 \le i \le n - 1$) is labeled by $i^+$ (resp. $i^-$). See Figure 1(a) for the linear network 
on n = 6 vertices rooted at vertex i = 6. We want to route permutations in 
$S_n$ on any tree T on n vertices. Given a tree T and a vertex i we call T(i) the 
subtree of T rooted at vertex i. 

We associate with any permutation a graphical representation. To represent 
the permutation $\sigma$ we draw an arrow from i to $\sigma(i)$, if $i \ne \sigma(i)$, that is, the path 
<i,$\sigma(i)$>, $1 \le i \le n$. The arrow going from i to $\sigma(i)$ crosses the arc $j^+$ if and 
only if i is in T(j) and $\sigma(i)$ is not in T(j), and it crosses the arc $j^-$ if and only 
if i is not in T(j) and $\sigma(i)$ is in T(j), $1 \le j \le n - 1$. 



Fig. 1. (a) Labeling of the vertices and the arcs for the linear network on n = 6 vertices 
rooted at vertex i = 6. (b) Representation of permutation $\sigma$ = (3, 1, 6, 5, 2, 4) on the 
linear network given in (a). 

Definition 1. Let T be a tree on n vertices and $\sigma$ be a permutation in $S_n$. We 
define the height of the arc $i^+$ (resp. height of the arc $i^-$), $1 \le i \le n - 1$, 
denoted $h_T^+(\sigma, i)$ (resp. $h_T^-(\sigma, i)$), as the number of paths crossing the arc $i^+$ 
(resp. $i^-$); that is, $h_T^+(\sigma, i) = |\{j \in T(i) \mid \sigma(j) \notin T(i)\}|$ (resp. $h_T^-(\sigma, i) = |\{j \notin 
T(i) \mid \sigma(j) \in T(i)\}|$). 

Lemma 1. Let T be a tree with n vertices. For all $\sigma$ in $S_n$ and for all $i \in 
\{1, 2, \ldots, n - 1\}$, $h_T^+(\sigma, i) = h_T^-(\sigma, i)$. 

This lemma is straightforward to prove. It tells us that in order to study the 
height of a permutation on a tree on n vertices, it suffices to consider only the 
height of the labeled arcs $i^+$. 

Definition 2. Given a tree T and a permutation $\sigma$ to be routed on T, the height 
of $\sigma$, denoted $h_T(\sigma)$, is the maximum number of paths crossing any arc of T: 
$h_T(\sigma) = \max_i h_T^+(\sigma, i)$. 

For example the permutation $\sigma$ = (3, 1, 6, 5, 2, 4) on the linear network in Figure 
1(a) has height 2 (see Figure 1(b)). The maximum is reached in the arcs $4^+$ and $4^-$. 
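On a linear network the subtree T(i) is simply {1, ..., i}, so the arc heights can be computed directly from Definition 1. The following sketch assumes the linear network 1-2-...-n rooted at n and a permutation given as a tuple with `sigma[j-1]` holding $\sigma(j)$; the function names are illustrative.

```python
def arc_heights_plus(sigma):
    """h^+(sigma, i) for i = 1..n-1: paths leaving T(i) = {1..i} through arc i^+."""
    n = len(sigma)
    return [sum(1 for j in range(1, i + 1) if sigma[j - 1] > i) for i in range(1, n)]

def arc_heights_minus(sigma):
    """h^-(sigma, i) for i = 1..n-1: paths entering T(i) = {1..i} through arc i^-."""
    n = len(sigma)
    return [sum(1 for j in range(i + 1, n + 1) if sigma[j - 1] <= i) for i in range(1, n)]

def linear_height(sigma):
    """Height of sigma on the linear network (maximum over the arcs i^+)."""
    return max(arc_heights_plus(sigma), default=0)
```

For the permutation (3, 1, 6, 5, 2, 4) of the example, `arc_heights_plus` returns `[1, 1, 1, 2, 1]`, so the height is 2 and it is reached at arc 4, and `arc_heights_minus` returns the same list, in agreement with Lemma 1.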

Definition 3. Given a tree T and a permutation $\sigma$ to be routed on T, the 
coloration number of $\sigma$, denoted $R_T(\sigma)$, is the minimum number of colors 
assigned to the paths on T associated with $\sigma$ such that no two paths sharing a same 
arc of T are assigned the same color. 

Clearly, for any permutation $\sigma$ of the vertex set of a tree T, we have $R_T(\sigma) \ge 
h_T(\sigma)$. For linear networks the equality holds, because the conflict graph of 
the paths associated with $\sigma$ is an interval graph (see [12]). Moreover, optimal 
vertex coloring for interval graphs can be computed efficiently [14]. However, for 
arbitrary tree networks, equality does not hold as we will see in Section 3.3. 
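For a linear network the equality can be observed with a standard greedy interval colouring. This is a sketch under stated assumptions, not the algorithm of [14]: each path from i to $\sigma(i)$ is treated as the half-open interval of arcs it uses, paths going in opposite directions never share an arc (the two arcs of a link are distinct), and each direction is coloured separately by sweeping over left endpoints. The names are illustrative.

```python
import heapq

def color_intervals(intervals):
    """Greedy colouring of half-open intervals [a, b); returns the number of colours used."""
    colours = 0
    busy = []                          # min-heap of right endpoints of coloured intervals
    for a, b in sorted(intervals):
        if busy and busy[0] <= a:      # some earlier interval has ended: reuse its colour
            heapq.heapreplace(busy, b)
        else:                          # all colours are in use at point a: open a new one
            colours += 1
            heapq.heappush(busy, b)
    return colours

def coloration_number_linear(sigma):
    """Coloration number of sigma on the linear network 1-2-...-n."""
    right = [(i, s) for i, s in enumerate(sigma, 1) if s > i]  # uses arcs i^+ .. (s-1)^+
    left = [(s, i) for i, s in enumerate(sigma, 1) if s < i]   # uses arcs s^- .. (i-1)^-
    return max(color_intervals(right), color_intervals(left))
```

On the example permutation (3, 1, 6, 5, 2, 4) this returns 2, which equals its height.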

3 Complexity of Computing the Coloration Number 

We begin this section by showing the NP-completeness of the routing 
permutations problem in binary trees, and then for the case of routing involutions 
on unbounded degree trees. Finally, we discuss some polynomial cases of this 
problem and we show, by an example, that in the case of binary trees having at 
most two vertices with degree equal to 3, the equality between the height and 
the coloration number of permutations does not hold. 



3.1 NP-Completeness Results 

Independently, Kumar et al. [15] and Erlebach and Jansen [6] have shown that 
computing a minimal coloring of any collection of paths on symmetric directed 
binary trees is NP-hard. However, the construction given in [15,6] does not work 
when the collection of paths represents a permutation of the vertex set of a 
binary tree. Thus, by using a reduction similar to the one used in [15,6] we 
obtain the following result. 

Theorem 1. Let $\sigma \in S_n$ be any permutation to be routed on a symmetric 
directed binary tree T on n vertices, then computing $R_T(\sigma)$ is NP-hard. 

Sketch of the proof. We use a reduction from the ARC-COLORING problem 
[19]. The ARC-COLORING problem can be defined as follows: given a positive 
integer k, an undirected cycle $C_n$ with vertex set numbered clockwise as 
1,2,...,n, and any collection of paths F on $C_n$, where each path <v,w> $\in$ F 
is regarded as the path beginning at vertex v and ending at vertex w again 
in the clockwise direction, can F be colored with k colors so that no two 
paths sharing an edge of $C_n$ are assigned the same color? It is well known that 
the ARC-COLORING problem is NP-complete [10]. Let I be an instance of the 
ARC-COLORING problem. We construct from I an instance I' of the routing 
permutations problem on binary trees, consisting of a symmetric directed binary 
tree T and a permutation-set of paths F' on T such that F can be k-colored if 
and only if F' can be k-colored. Without loss of generality, we may assume that 
each edge of $C_n$ is crossed by exactly k paths in F. If some edge of $C_n$ is crossed 
by more than k paths, then this can be discovered in polynomial time, and it 
implies that the answer in this instance I must be “no”. If some edge [i,i+1] of 
$C_n$ is crossed by r < k paths, then we can add k - r paths of the form <i,i+1> 
(or <i,1> if i = n) to F without changing its k-colorability. 

Let $B(i) \subseteq F$ (resp. $E(i) \subseteq F$) be the subcollection of paths of F beginning 
(resp. ending) at vertex i of $C_n$, $1 \le i \le n$. Thus, by the previous hypothesis, it 
is easy to verify that the following property holds for instance I. 

Claim. For all vertices i of $C_n$, |B(i)| = |E(i)|. 

Construction of the binary tree T of I': first, construct a line on 2k + n vertices 
denoted from left to right by $l_k, l_{k-1}, \ldots, l_2, l_1, v_1, v_2, \ldots, v_n, r_1, r_2, \ldots, r_k$. Next, 
for each vertex $l_i$ (resp. $r_i$), $1 \le i \le k$, construct a new different line on 2k + 1 
vertices denoted from left to right by $ll_i^1, ll_i^2, \ldots, ll_i^k, wl_i, rl_i^k, \ldots, rl_i^1$ (resp. 
$lr_i^1, lr_i^2, \ldots, lr_i^k, wr_i, rr_i^k, \ldots, rr_i^1$) and add to T the arc set $\{(wl_i, l_i), 
(l_i, wl_i)\}$ (resp. $\{(wr_i, r_i), (r_i, wr_i)\}$). Finally, for each vertex $v_i$, $1 \le i \le n$, if 
|B(i)| > 1, then construct a new different line on $\alpha_i = |B(i)| - 1$ vertices denoted 
by $v_i^1, v_i^2, \ldots, v_i^{\alpha_i}$ and add to T the arc set $\{(v_i^1, v_i), (v_i, v_i^1)\}$. 

The construction of the permutation-set of paths F' of I' is as follows: for each 
path <i,j> $\in$ F, let $b_i$ (resp. $e_j$) be the first vertex of T in $\{v_i, v_i^1, \ldots, v_i^{\alpha_i}\}$ 
(resp. $\{v_j, v_j^1, \ldots, v_j^{\alpha_j}\}$) not already used by any path in F' as beginning-vertex 
(resp. ending-vertex), then we consider the following two types of paths in F: 

• Type 1: i < j. Then add to F' the path set $\{<b_i, e_j>\}$. 

• Type 2: i > j. Let $r_p$ (resp. $l_q$) be the first vertex of T in $\{r_1, r_2, \ldots, r_k\}$ (resp. 
$\{l_1, l_2, \ldots, l_k\}$) such that the arc $(r_p, wr_p)$ (resp. $(l_q, wl_q)$) of T has not been already 
used by any path in F', then add to F' the path set $\{<b_i, rr_p^1>, <lr_p^1, rl_q^1>, 
<ll_q^1, e_j>\}$. 

In addition, for each i, $1 \le i \le k$, add to F' the following path sets: 
$\{<ll_i^j, rl_i^j> : 2 \le j \le k\} \cup \{<rl_i^s, ll_i^s> : 1 \le s \le k\}$ and $\{<lr_i^j, rr_i^j> : 2 \le j \le 
k\} \cup \{<rr_i^s, lr_i^s> : 1 \le s \le k\}$. The paths $<ll_i^j, rl_i^j>$ and $<lr_i^j, rr_i^j>$, $2 \le j \le k$, 
$1 \le i \le k$, act as blockers. They make sure that all the three paths in F' 
corresponding to one path in F of type 2 are colored with the same color in any 
k-coloration of F'. The other paths, that we call permutation paths, are used to 
ensure that the path collection F' represents a permutation of the vertex set 
of T. In Figure 2 we present an example of this polynomial construction. By 
our construction, it is easy to check that the set of paths F' on T represents a 
permutation of the vertex set of T, and that there is a k-coloring of F if and 
only if there is a k-coloring of F'. □ 

Fig. 2. Partial construction of I' from I, where k = 3. 

In the case of unbounded degree symmetric directed trees, Caragiannis et 
al. [3] have shown that the path coloring problem remains NP-hard even if the 
collection of paths is symmetric (we call this problem the symmetric path 
coloring problem), i.e., for each path beginning at vertex $v_1$ and ending at vertex $v_2$, 
there also exists its symmetric counterpart, a path beginning at $v_2$ and ending at $v_1$. Thus, 
using a polynomial reduction from the symmetric path coloring problem on trees 
[3] we have the following result, whose proof is omitted for lack of space. 

Theorem 2. Let $\sigma \in I_n$ be any involution to be routed on an unbounded degree 
tree T on n vertices. Then computing $R_T(\sigma)$ is NP-hard. 
3.2 Polynomial Cases 

As noticed in Section 2, the coloration number associated to any permutation 
to be routed on a linear network can be computed efficiently in polynomial time 
[14]. In the case of generalized star networks, i.e., tree networks having only one 
vertex with degree greater than 2 and the other vertices with degree at most 
2, Gargano et al. [11] show that an optimal coloring of any collection of paths on 
these networks can be computed efficiently in polynomial time. Moreover, it is 
also shown in [11] that the number of colors needed to color any collection of paths 
on a generalized star network is equal to the height of such a collection of paths. 
Thus, based on the results given in [11] we obtain the following proposition. 

Proposition 1. Given a generalized star network G on n vertices and a 
permutation $\sigma \in S_n$ to be routed on G, the coloration number $R_G(\sigma)$ can be computed 
efficiently in polynomial time. Moreover, $R_G(\sigma) = h_G(\sigma)$ always holds. 



3.3 General Trees 

Given any permutation $\sigma \in S_n$ to be routed on a tree T on n vertices, the 
equality between the height $h_T(\sigma)$ and the coloration number $R_T(\sigma)$ does not 
always hold. In Figure 3(a) we give an example of a permutation $\sigma \in S_{10}$ to be 
routed on a tree T on 10 vertices, whose height $h_T(\sigma)$ is equal to 2. Moreover, 
in Figure 3(b) we present the conflict graph G associated with $\sigma$, that is, an 
undirected graph whose vertices are the paths on T associated with $\sigma$, and in 
which two vertices are adjacent if and only if their associated paths share a same 
arc in T. Thus, clearly the coloration number $R_T(\sigma)$ is equal to the chromatic 
number of G. Therefore, as the conflict graph G has the cycle $C_5$ as an induced 
subgraph, the chromatic number of G is equal to 3, and thus $R_T(\sigma) = 3$. 




Fig. 3. (a) A tree T on 10 vertices and a permutation $\sigma$ = (5, 4, 8, 2, 6, 3, 9, 10, 7, 1) to 
be routed on T. (b) The conflict graph G associated with permutation $\sigma$ in (a). 

The best known approximation algorithm for coloring any collection of paths 
with height h on any tree network is given in [7], which uses at most $\lceil \frac{5}{3} h \rceil$ colors. 
Therefore it trivially also holds for any permutation-set of paths with height h 
on any tree. 

Proposition 2. Given a tree T on n vertices and a permutation $\sigma \in S_n$ to be 
routed on T with height $h_T(\sigma)$, there exists a polynomial algorithm for coloring 
the paths on T associated with $\sigma$ which uses at most $\lceil \frac{5}{3} h_T(\sigma) \rceil$ colors. 
4 Average Coloration Number on Linear Networks 

The main result of this section is the following: 

Theorem 3. The average coloration number of the permutations in $S_n$ to be 
routed on a linear network on n vertices is 

4 2 

where $\lambda$ = 0.99615.... 

To prove this result, we use the equality between the height and the coloration 
number (see Section 2). Then our approach, developed in Subsections 4.1 and 
4.2, is as follows: first we recall a bijection between permutations in $S_n$ and 
special walks in N x N, called “Motzkin walks”, which are labeled in a certain 
way. The bijection is such that the height parameter is “preserved”. Then we 
prove Theorem 3 by studying the asymptotic behaviour of the height of these 
walks. On the other hand, we get in Subsection 4.3 the generating function of 
permutations with coloration number k, for any given k. This gives rise to an 
algorithm to compute exactly the average coloration number of the permutations 
for any fixed n. 

4.1 A Bijection between Permutations and Motzkin Walks 

A Motzkin walk of length n is an (n+1)-tuple $(s_0, s_1, \ldots, s_n)$ of points in N x N 
satisfying the following conditions: 

- For all $0 \le i \le n$, $s_i = (i, y_i)$ with $y_i \ge 0$; 

- $y_0 = y_n = 0$; 

- For all $0 \le i < n$, $y_{i+1} - y_i$ equals either 1 (North-East step), or 0 (East 
step), or -1 (South-East step). 

The height of a Motzkin walk $\omega$ is $H(\omega) = \max_{0 \le i \le n} \{y_i\}$. 

Labeled Motzkin walks are Motzkin walks in which steps can be labeled by 
integers. These structures are in relation with several well-studied combinatorial 
objects [8,20,21] and in particular with permutations. The walks we will deal 
with are labeled as follows: 

- each South-East step $(i, y_i) \to (i + 1, y_i - 1)$ is labeled by an integer between 
1 and $y_i^2$ (or, equivalently, by a pair of integers, each one between 1 and $y_i$); 

- each East step $(i, y_i) \to (i + 1, y_i)$ is labeled by an integer between 1 and 
$2y_i + 1$. 

Let $P_n$ be the set of such labeled Motzkin walks of length n. We recall that 
$S_n$ is the set of permutations on [n]. The following result was first established 
by Françon and Viennot [9]: 

Theorem (Françon-Viennot) There is a one-to-one correspondence between the 
elements of $P_n$ and the elements of $S_n$. 
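The cardinality statement behind this theorem can be checked by a short dynamic program over the current height. The sketch assumes the labeling rules above (North-East steps unlabeled, East steps at height y with 2y + 1 labels, South-East steps from height y with $y^2$ labels); the function name and the optional height cap are illustrative additions.

```python
from math import factorial

def count_labeled_motzkin(n, max_height=None):
    """Number of labeled Motzkin walks of length n, optionally with height <= max_height."""
    cap = n if max_height is None else max_height
    ways = [0] * (cap + 1)            # ways[y]: labeled prefixes currently at height y
    ways[0] = 1
    for _ in range(n):
        nxt = [0] * (cap + 1)
        for y, w in enumerate(ways):
            if w == 0:
                continue
            if y + 1 <= cap:
                nxt[y + 1] += w               # North-East step, no label
            nxt[y] += w * (2 * y + 1)         # East step, 2y + 1 possible labels
            if y > 0:
                nxt[y - 1] += w * y * y       # South-East step, y^2 possible labels
        ways = nxt
    return ways[0]
```

Without a cap the count equals n!, and with a cap k it gives the number of walks of height at most k; for example `count_labeled_motzkin(4, 1)` is 20, leaving 4 walks of height 2.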
Several bijective proofs of this theorem are known. Biane’s bijection [2] is 
particular, in the sense that it preserves the height: to any labeled Motzkin walk of 
length n and height k corresponds a permutation in $S_n$ with height k (and so 
with coloration number k). We do not present here the whole of Biane’s bijection; 
we just focus on the construction of the (unlabelled) Motzkin walk associated 
to a permutation, in order to show that the height is preserved. This property, 
which is not explicitly noted in Biane’s paper, is essential for our purpose. 

Biane’s correspondence between a permutation $\sigma = (\sigma(1), \sigma(2), \ldots, \sigma(n))$ 
and a labeled Motzkin walk $\omega = (s_0, s_1, \ldots, s_n)$ is such that, for $1 \le i \le n$: 

- step $(s_{i-1}, s_i)$ is a North-East step if and only if $\sigma(i) > i$ and $\sigma^{-1}(i) > i$; 

- step $(s_{i-1}, s_i)$ is a South-East step if and only if $\sigma(i) < i$ and $\sigma^{-1}(i) < i$; 

- otherwise, step $(s_{i-1}, s_i)$ is an East step. 

Now, for any $1 \le i \le n$, the height of point $s_i$ in $\omega$ is obviously equal to 
the number of North-East steps minus the number of South-East steps in the 
truncated walk $(s_0, s_1, \ldots, s_i)$. On the other hand, we can prove easily that the 
height of arc $i^+$ in $\sigma$ is equal to the number of integers $j \le i$ such that 
$\sigma(j) > j$ and $\sigma^{-1}(j) > j$, minus the number of integers $j \le i$ such that 
$\sigma(j) < j$ and $\sigma^{-1}(j) < j$. This proves the property. We present in 
Figure 4 an example of correspondence. The above description allows one to 
construct the “skeleton” of the permutation, in the center of the figure, given 
the Motzkin walk on the top. Then the labeling of the path allows one to 
complete the permutation. This is described in detail in [2] and in the full 
version of this paper, in preparation. 
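The step rule can be turned into a few lines of code that build the unlabelled walk of a permutation. This sketch covers only the skeleton, not the labels of Biane's bijection; the function name is illustrative and the permutation is given as a tuple with `sigma[i-1]` holding $\sigma(i)$.

```python
def motzkin_skeleton(sigma):
    """Heights y_0, ..., y_n of the unlabelled Motzkin walk attached to sigma."""
    n = len(sigma)
    inverse = [0] * (n + 1)
    for i, s in enumerate(sigma, 1):
        inverse[s] = i                    # inverse[s] = sigma^{-1}(s)
    heights = [0]
    for i in range(1, n + 1):
        s, t = sigma[i - 1], inverse[i]
        if s > i and t > i:
            step = 1                      # North-East
        elif s < i and t < i:
            step = -1                     # South-East
        else:
            step = 0                      # East
        heights.append(heights[-1] + step)
    return heights
```

For (3, 1, 6, 5, 2, 4) the result is `[0, 1, 1, 1, 2, 1, 0]`: the interior values coincide with the heights of the arcs $1^+, \ldots, 5^+$ of that permutation on the linear network, and the maximum 2 is its height.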



Fig. 4. From a walk to a permutation. 




4.2 Proof of Theorem 3 

In [16], Louchard analyzes some list structures; in particular his "dictionary 
structure" corresponds to our labeled Motzkin walks. We will use his notation 
in order to refer directly to his article. From Louchard's Theorem 6.2, we deduce 
the following lemma: 

Lemma 2. The height $Y^*([nv])$ of a random labeled Motzkin walk of length $n$ 
after the step $[nv]$ ($v \in [0, 1]$) has the following behavior: 

$$\frac{Y^*([nv]) - n v (1 - v)}{\sqrt{n}} \Rightarrow X(v),$$

where "$\Rightarrow$" denotes weak convergence and $X$ is a Markovian process with 
mean 0 and covariance $C(s,t) = 2 s^2 (1 - t)^2$, $s \le t$. 

Then the work of Daniels and Skyrme [5] gives us a way to compute the maximum 
of $Y^*([nv])$, that is, the height of a random labeled Motzkin walk. 

Proposition 3. The height of a random labeled Motzkin walk $Y^*$ is 

maxy*([nw]) = ^ + mJnl2 + 0(n^/®), (1) 

V 4 

where m is asymptotically Gaussian with mean E{m) ~ and 

variance a‘^{m) ~ 1/8 and X = 0.99615 .... 

In formula (1) of Proposition 3 above, the only non-deterministic 
part is $m$, which is Gaussian. So we just have to replace $m$ by $E(m)$ to prove 
Theorem 3. 

4.3 An Algorithm to Compute Exactly the Average Coloration 
Number 

We just have to look at known results in enumerative combinatorics [8,21] to 
get the generating function of the permutations of coloration number exactly 
$k$, that is 

with $P_0(z) = 1$, $P_1(z) = z - b_0$ and $P_{n+1}(z) = (z - b_n) P_n(z) - \lambda_n P_{n-1}(z)$ for 
$n \ge 1$, where $P^*$ is the reciprocal polynomial of $P$, that is $P_n^*(z) = z^n P_n(1/z)$ 
for $n \ge 0$. 

This generating function leads to a recursive algorithm to compute the number 
of permutations with coloration number $k$, denoted by $h_{n,k}$. 

Proposition 4. The number $h_{n,k}$ of permutations in $S_{n,k}$ satisfies the following 
recurrence 

f 0 if n < 2fc 

hn,k = < (^0^ if n = 2fc 

I “ P(^)f^n-i,k otherwise 

where p{i) is the coefficient of z'' in Pfj^i{z)Pf^{z). 

From this result we are able to compute the average height of a permutation, 
since it is $h(n) = \sum_{k \ge 0} k \, h_{n,k} / n!$. 

5 Open Problems and Future Work 

The complexity of routing involutions on binary trees by arc-disjoint paths 
remains open. The average coloration number of permutations to be routed 
on general trees is also an interesting open problem. Computing the average 
coloration number of permutations to be routed on networks of arbitrary topology 
seems to be a very difficult problem. 

Acknowledgements. We are very grateful to Philippe Flajolet, Dominique 
Gouyou-Beauchamps and Guy Louchard for their help. 

1 Introduction 

1.1 Historical Context 

The Euclidean Algorithm was first documented by Euclid (320-275 BC). Knuth 
(1981), p. 318, writes: “We might call it the granddaddy of all algorithms, be- 
cause it is the oldest nontrivial algorithm that has survived to the present day. ” It 
performs division with remainder repeatedly until the remainder becomes zero. 
With inputs 13 and 9 it performs the following: 

$13 = 1 \cdot 9 + 4,$ 
$9 = 2 \cdot 4 + 1,$ 
$4 = 4 \cdot 1 + 0.$ 

This allows us to compute the greatest common divisor (gcd) of two integers 
as the last non-vanishing remainder. In the example, the gcd of 13 and 9 is 
computed as 1. 
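As a small illustration, the repeated division with remainder can be traced in Python. The function name `euclid_trace` and the tuple layout (dividend, quotient, divisor, remainder) are choices made for this sketch only.

```python
def euclid_trace(a, b):
    """Run the Euclidean Algorithm on integers a, b.

    Returns (gcd, steps) where each step is (dividend, quotient, divisor,
    remainder), i.e. dividend = quotient * divisor + remainder.
    """
    steps = []
    while b != 0:
        quotient, remainder = divmod(a, b)
        steps.append((a, quotient, b, remainder))
        a, b = b, remainder
    return a, steps


g, steps = euclid_trace(13, 9)
for dividend, quotient, divisor, remainder in steps:
    print(f"{dividend} = {quotient} * {divisor} + {remainder}")
print("gcd =", g)
```

The last non-vanishing remainder is returned as the gcd.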

At the end of the 17th century the concept of polynomials was evolving. 
Researchers were interested in finding the common roots of two polynomials f 
and g. One question was whether it is possible to apply the Euclidean Algorithm 
to f and g. In 1707 Newton solved this problem and showed that this always 
works in Q[x]. 



$$x^3 + 2x^2 - x - 2 = \left(\tfrac{1}{2}x + \tfrac{3}{2}\right)(2x^2 - 2x - 4) + (4x + 4),$$
$$2x^2 - 2x - 4 = \left(\tfrac{1}{2}x - 1\right)(4x + 4) + 0.$$

In this example $f = x^3 + 2x^2 - x - 2$ and $g = 2x^2 - 2x - 4$ have a greatest 
common divisor $4x + 4$, and therefore the only common root is $-1$. In a certain 
sense the Euclidean Algorithm computes all common roots. If you only want to 
know whether f and g have at least one common root, then the whole Euclidean 
Algorithm has to be executed. Thus the next goal was to find an indicator for 
common roots without using any division with remainder. 
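The two divisions in this example can be reproduced with exact rational arithmetic. The sketch below is an assumption-laden illustration: polynomials are stored as coefficient lists with the highest degree first, and `poly_divmod` and `remainder_sequence` are names invented here.

```python
from fractions import Fraction


def poly_divmod(f, g):
    """Division with remainder in Q[x]; coefficient lists, highest degree first."""
    remainder = [Fraction(c) for c in f]
    divisor = [Fraction(c) for c in g]
    quotient = []
    while len(remainder) >= len(divisor):
        factor = remainder[0] / divisor[0]
        quotient.append(factor)
        for j, c in enumerate(divisor):      # cancel the leading term
            remainder[j] -= factor * c
        remainder.pop(0)
    while remainder and remainder[0] == 0:   # strip leading zeros
        remainder.pop(0)
    return quotient, remainder


def remainder_sequence(f, g):
    """All remainders of the Euclidean Algorithm, ending with the zero polynomial []."""
    sequence = [list(f), list(g)]
    while sequence[-1]:
        _, r = poly_divmod(sequence[-2], sequence[-1])
        sequence.append(r)
    return sequence


f = [1, 2, -1, -2]      # x^3 + 2x^2 - x - 2
g = [2, -2, -4]         # 2x^2 - 2x - 4
print(poly_divmod(f, g))
print(remainder_sequence(f, g)[-2])   # last non-vanishing remainder
```

The last non-vanishing remainder has coefficients (4, 4), that is 4x + 4.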

The key to success was found in 1748 by Euler, and later by Bezout. They 
defined the resultant of f and g as the smallest polynomial in the coefficients of f 
and g that vanishes if and only if f and g have a common root. In 1764 Bezout was 
the first to find a matrix whose determinant is the resultant. The entries of this 
Bezout matrix are quadratic functions of the coefficients of f and g. Today we 
use the matrix discovered by Sylvester in 1840, known as the Sylvester matrix. Its 
entries are simply coefficients of the polynomials f and g. Sylvester generalized 
his definition and introduced what we now call subresultants as determinants 
of certain submatrices of the Sylvester matrix. They are nonzero if and only if 
the corresponding degree appears as a degree of a remainder of the Euclidean 
Algorithm. 

These indicators, in particular the resultant, also work for polynomials in 
Z[x]. So the question came up whether it is possible to apply the Euclidean 
Algorithm to f and g in Z[x] without leaving Z[x]. The answer is no, as illustrated 
in the example above, since division with remainder is not always defined in Z[x], 
although the gcd exists. In the example it is $x + 1$. 

However, in 1836 Jacobi found a way out. He introduced pseudo-division: 
he multiplied f by a certain power of the leading coefficient of g before 
performing the division with remainder. This is always possible in Z[x]. So using 
pseudo-division instead of division with remainder in every step of the Euclidean 
Algorithm yields an algorithm with all intermediate results in Z[x]. 

About 40 years later Kronecker did research on the Laurent series in $x^{-1}$ of 
$g/f$ for two polynomials f and g. He considered the determinants of a matrix 
whose entries are the coefficients of the Laurent series of $g/f$. He obtained the 
same results as Sylvester, namely that these determinants are nonzero if and 
only if the corresponding degree appears in the degree sequence of the Euclidean 
Algorithm. Furthermore Kronecker gave a direct way to compute low degree 
polynomials s, t and r with $sf + tg = r$ via determinants of matrices derived 
again from the Laurent series of $g/f$, and showed that these polynomials are 
essentially the only ones. He also proved that the polynomial r, if nonzero, 
agrees with a remainder in the Euclidean Algorithm, up to a constant multiple. 
This was the first occurrence of polynomial subresultants. 

In the middle of our century, again 70 years later, the realization of computers 
made it possible to perform more and more complicated algorithms faster and 
faster. However, using pseudo-division in every step of the Euclidean Algorithm 
causes exponential coefficient growth. This was suspected in the late 1960's. 
Collins (1967), p. 139 writes: "Thus, for the Euclidean algorithm, the lengths 
of the coefficients increases exponentially." In Brown & Traub (1971) we find: 
"Although the Euclidean PRS algorithm is easy to state, it is thoroughly 
impractical since the coefficients grow exponentially." An exponential upper bound 
is in Knuth (1981), p. 414: "Thus the upper bound [. . . ] would be approximately 
$N^{0.5(2.414)^n}$, and experiments show that the simple algorithm does in fact have 
this behavior; the number of digits in the coefficients grows exponentially at 
each step!". However, we did not find a proof of an exponential lower bound; 
our bound in Theorem 7.3 seems to be new. 

One way out of this exponential trap is to make every intermediate result 
primitive, that is, to divide the remainders by the greatest common divisors of 
their coefficients, the so-called content. However, computing the contents seemed 
to be very expensive since in the worst case the gcd of all coefficients has to be 
computed. So the scientists tried to find divisors of the contents without us- 
ing any gcd computation. Around 1970, first Collins and then Brown & Traub 
reinvented the polynomial subresultants as determinants of a certain variant of 
the Sylvester matrix. Habicht had also defined them independently in 1948. 
Collins and Brown & Traub showed that they agree with the remainders of the 
Euclidean Algorithm up to a constant factor. They gave simple formulas to com- 
pute this factor and introduced the concept of polynomial remainder sequences 
(PRS), generalizing the concept of Jacobi. The final result is the subresultant 
PRS that features linear coefficient growth with intermediate results in Z[x]. 

Since then two further concepts have come up. On the one hand the fast 
EEA allows one to compute an arbitrary intermediate line in the Euclidean Scheme 
directly. Using the fast $O(n \log n \log\log n)$ multiplication algorithm of Schönhage 
and Strassen, the time for a gcd reduces from $O(n^2)$ to $O(n \log^2 n \log\log n)$ 
field operations (see Strassen (1983)). On the other hand, the modular EEA is 
very efficient. These two topics are not considered in this thesis; for further 
information we refer to von zur Gathen & Gerhard (1999), Chapters 6 and 11. 

1.2 Outline 

After introducing the notation and some well-known facts in Section 2, we start 
with an overview and comparison of various definitions of subresultants in 
Section 3. Mulders (1997) describes an error in software implementations of an 
integration algorithm which was due to the confusion caused by these various 
definitions. It turns out that there are essentially two different ways of defining 
them: the scalar and the polynomial subresultants. Furthermore we show their 
relation with the help of the Euclidean Algorithm. In the remainder of this work 
we will mainly consider the scalar subresultants. 

In Section 4 we give a formal definition of polynomial remainder sequences 
and derive the most famous ones as special cases of our general notion. The 
relation between polynomial remainder sequences and subresultants is exhibited 
in the Fundamental Theorem 5.1 in Section 5. It unifies many results in the 
literature on various types of PRS which can be derived as corollaries from 
this theorem. In Section 6 we apply it to the various definitions of polynomial 
remainder sequences already introduced. This yields a collection of results from 
Collins (1966, 1967, 1971, 1973), Brown (1971, 1978), Brown & Traub (1971), 
Lickteig & Roy (1997) and von zur Gathen & Gerhard (1999). Lickteig & Roy 
(1997) found a recursion formula for polynomial subresultants not covered by the 
Fundamental Theorem. We translate it into a formula for scalar subresultants 
and use it to finally solve an open question in Brown (1971), p. 486. In Section 7 
we analyse the coefficient growth and the running time of the various PRS. 

Finally in Section 8 we report on implementations of the various polynomial 
remainder sequences and compare their running times. It turns out that 
computing contents is quite fast for random inputs, and that the primitive PRS behaves 
much better than expected. 

Much of this Extended Abstract is based on the existing literature. The 
following results are new: 

— rigorous and general definition of division rules and PRS, 

— proof that all constant multipliers in the subresultant PRS for polynomials 
over an integral domain R are also in R, 

— exponential lower bound for the running time of the pseudo PRS (algorithm). 

2 Foundations 

In this chapter we introduce the basic algebraic notions. We refer to von zur Ga- 
then & Gerhard (1999), Sections 2.2 and 25.5, for the notation and fundamental 
facts about greatest common divisors and determinants. More information on 
these topics is in Hungerford (1990). 

2.1 Polynomials 

Let R be a ring. In what follows, this always means a commutative ring with 1. 

A basic tool in computer algebra is division with remainder. For given 
polynomials f and g in R[x] of degrees n and m, respectively, the task is to find 
polynomials q and r in R[x] with 

$f = qg + r$ and $\deg r < \deg g$. (2.1) 

Unfortunately such q and r do not always exist. 

Example 2.2. It is not possible to divide $x^2$ by $2x + 3$ with remainder in Z[x] 
because $x^2 = (ux + v)(2x + 3) + r$ with $u, v, r \in$ Q has the unique solution 
$u = 1/2$, $v = -3/4$ and $r = 9/4$, which is not over Z. 

If defined and unique we call q = quo(f, g) the quotient and r = rem(f, g) 
the remainder. A ring with a length function (like the degree of polynomials) 
and where division with remainder is always defined is a Euclidean domain. R[x] 
is a Euclidean domain if and only if R is a field. Moreover a solution of (2.1) is 
not necessarily unique if the leading coefficient lc(g) of g is a zero divisor. 

Example 2.3. Let $R = \mathbb{Z}_8$ and consider $f = 4x^2 + 2x$ and $g = 2x + 1$. With 

$q_1 = 2x$, $r_1 = 0$ 
$q_2 = 2x + 4$, $r_2 = 4$ 

we obtain 

$q_1 g + r_1 = 2x(2x + 1) + 0 = 4x^2 + 2x = f$, 
$q_2 g + r_2 = (2x + 4)(2x + 1) + 4 = 4x^2 + 10x + 8 = 4x^2 + 2x = f$. 

Thus we have two distinct solutions $(q_1, r_1)$ and $(q_2, r_2)$ of (2.1). 
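The arithmetic of Example 2.3 can be checked mechanically. In this sketch, polynomials over the integers modulo 8 are stored as coefficient lists with the highest degree first; `poly_mul` and `poly_add` are helper names introduced only for this illustration.

```python
def poly_mul(a, b, modulus):
    """Product of two polynomials with coefficients reduced modulo `modulus`."""
    product = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            product[i + j] = (product[i + j] + x * y) % modulus
    return product


def poly_add(a, b, modulus):
    """Sum of two polynomials modulo `modulus`; shorter list is left-padded."""
    size = max(len(a), len(b))
    a = [0] * (size - len(a)) + list(a)
    b = [0] * (size - len(b)) + list(b)
    return [(x + y) % modulus for x, y in zip(a, b)]


f = [4, 2, 0]                                   # 4x^2 + 2x
g = [2, 1]                                      # 2x + 1
solutions = [([2, 0], [0]), ([2, 4], [4])]      # (q1, r1) and (q2, r2)

# Both pairs satisfy f = q*g + r in Z_8[x].
checks = [poly_add(poly_mul(q, g, 8), r, 8) == f for q, r in solutions]
print(checks)

# lc(g) = 2 is a zero divisor in Z_8, which is why uniqueness fails.
print((2 * 4) % 8 == 0)
```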
A way to get solutions for all commutative rings is the general pseudo-division 
which allows multiplication of f by a ring element $\alpha$: 

$\alpha f = qg + r$, $\deg r < \deg g$. (2.4) 

If $\alpha = \mathrm{lc}(g)^{n-m+1}$, then this is the (classical) pseudo-division. If lc(g) is not a zero 
divisor, then (2.4) always has a unique solution in R[x]. We call q = pquo(f, g) 
the pseudo-quotient and r = prem(f, g) the pseudo-remainder. 

Example 2.2 continued. For $x^2$ and $2x + 3$ we get the pseudo-division 

$2^2 \cdot x^2 = (2x - 3)(2x + 3) + 9$. 

A simple computation shows that we cannot choose $\alpha = 2$. 

Lemma 2.5. 

(i) Pseudo-division always yields a solution of (2.4) in R[x]. 

(ii) If lc(g) is not a zero divisor, then any solution of (2.4) has $\deg q = n - m$. 

Lemma 2.6. The solution (q, r) of (2.4) is uniquely determined if and only if 
lc(g) is not a zero-divisor. 
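A minimal sketch of classical pseudo-division over the integers, assuming deg f >= deg g and coefficient lists with the highest degree first. The name `pseudo_divmod` is hypothetical; it returns the multiplier lc(g)^(n-m+1) together with the pseudo-quotient and pseudo-remainder.

```python
def pseudo_divmod(f, g):
    """Classical pseudo-division: alpha * f = q * g + r with alpha = lc(g)^(n-m+1).

    f, g are integer coefficient lists (highest degree first), deg f >= deg g.
    Returns (alpha, q, r); r has leading zeros stripped ([] is the zero polynomial).
    """
    n, m = len(f) - 1, len(g) - 1
    leading = g[0]
    quotient, remainder = [], list(f)
    for _ in range(n - m + 1):
        top = remainder[0]
        # Multiply everything by lc(g), then cancel the formal leading term.
        quotient = [leading * c for c in quotient] + [top]
        remainder = [leading * c for c in remainder]
        for j, c in enumerate(g):
            remainder[j] -= top * c
        remainder.pop(0)
    while remainder and remainder[0] == 0:
        remainder.pop(0)
    return leading ** (n - m + 1), quotient, remainder


# Example 2.2 continued: 2^2 * x^2 = (2x - 3)(2x + 3) + 9
print(pseudo_divmod([1, 0, 0], [2, 3]))
```

No division is performed anywhere, so all intermediate values stay in the coefficient ring.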

Let R be a unique factorization domain. We then have $\gcd(f, g) \in R$ for 
$f, g \in R[x]$, and the content $\mathrm{cont}(f) = \gcd(f_0, \ldots, f_n) \in R$ of $f = \sum_{0 \le j \le n} f_j x^j$. 
The polynomial is primitive if cont(f) is a unit. The primitive part pp(f) is 
defined by $f = \mathrm{cont}(f) \cdot \mathrm{pp}(f)$. Note that pp(f) is a primitive polynomial. 

The Euclidean Algorithm computes the gcd of two polynomials by iterating 
the division with remainder: 



$r_{i-1} = q_i r_i + r_{i+1}$. (2.7) 

3 Various Notions of Subresultants 

3.1 The Sylvester Matrix 

The various definitions of the subresultant are based on the Sylvester matrix. 
Therefore we first take a look at the historical motivation for this special 
matrix. Our goal is to decide whether two polynomials $f = \sum_{0 \le j \le n} f_j x^j$ and 
$g = \sum_{0 \le j \le m} g_j x^j$ of degree $n \ge m > 0$ over a commutative ring R 
in the indeterminate x have a common root. To answer this question, 
Euler (1748) and Bezout (1764) introduced the (classical) resultant, which 
vanishes if and only if this is true. Although Bezout also succeeded in finding 
a matrix whose determinant is equal to the resultant, today called the Bezout 
matrix, we will follow the elegant derivation in Sylvester (1840). The two linear 
equations 

$$f_n x_n + f_{n-1} x_{n-1} + \cdots + f_1 x_1 + f_0 x_0 = 0$$
$$g_m x_m + g_{m-1} x_{m-1} + \cdots + g_1 x_1 + g_0 x_0 = 0$$

in the indeterminates $x_0, \ldots, x_n$ are satisfied if $x_j = \alpha^j$ for all j, where $\alpha$ is a 
common root of f and g. For n > 1 there are many more solutions of these two 
linear equations in many variables, but Sylvester eliminates them by adding the 
$(m - 1) + (n - 1)$ linear equations that correspond to the following additional 
conditions: 

$$x f(x) = 0, \ldots, x^{m-1} f(x) = 0,$$
$$x g(x) = 0, \ldots, x^{n-1} g(x) = 0.$$

These equations give a total of n + m linear relations among the variables 
$x_{m+n-1}, \ldots, x_0$: 

$$\begin{aligned}
f_n x_{m+n-1} + \cdots + f_0 x_{m-1} &= 0 \\
&\;\;\vdots \\
f_n x_n + f_{n-1} x_{n-1} + \cdots + f_0 x_0 &= 0 \\
g_m x_{m+n-1} + \cdots + g_0 x_{n-1} &= 0 \\
&\;\;\vdots \\
g_m x_m + g_{m-1} x_{m-1} + \cdots + g_0 x_0 &= 0
\end{aligned}$$

Clearly $x_j = \alpha^j$ gives a solution for any common root $\alpha$ of f and g, but the point 
is that (essentially) the converse also holds: a solution of the linear equations 
gives a common root (or factor). The $(n + m) \times (n + m)$ matrix, consisting of 
coefficients of f and g, that belongs to this system of linear equations is often called 
the Sylvester matrix. In the sequel we follow von zur Gathen & Gerhard (1999), 
Section 6.3, p. 144, and take its transpose. 

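Definition 3.1. Let R be a commutative ring and let $f = \sum_{0 \le j \le n} f_j x^j$ and 
$g = \sum_{0 \le j \le m} g_j x^j \in R[x]$ be polynomials of degree $n \ge m > 0$, respectively. 
Then the $(n + m) \times (n + m)$ matrix 

$$\mathrm{Syl}(f,g) = \begin{pmatrix}
f_n     &         &        &         & g_m     &         &        &         \\
f_{n-1} & f_n     &        &         & g_{m-1} & g_m     &        &         \\
\vdots  & f_{n-1} & \ddots &         & \vdots  & g_{m-1} & \ddots &         \\
\vdots  & \vdots  & \ddots & f_n     & g_0     & \vdots  & \ddots & g_m     \\
f_0     & \vdots  &        & f_{n-1} &         & g_0     &        & g_{m-1} \\
        & f_0     &        & \vdots  &         &         & \ddots & \vdots  \\
        &         & \ddots & \vdots  &         &         &        & \vdots  \\
        &         &        & f_0     &         &         &        & g_0
\end{pmatrix}$$

with m columns of coefficients of f followed by n columns of coefficients of g 
(each column is the previous one shifted down by one row, and all entries not 
shown are zero) is called the Sylvester matrix of f and g. 

Remark 3.2. Multiplying the $(n + m - j)$th row by $x^j$ and adding it to the 
last row for $1 \le j < n + m$, we get the $(n + m) \times (n + m)$ matrix $S^*$. Thus 
$\det(\mathrm{Syl}(f,g)) = \det(\mathrm{Syl}^*(f,g))$. 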
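To make the construction concrete, the following sketch builds the (transposed) Sylvester matrix of Definition 3.1 and evaluates its determinant exactly. It assumes integer or rational coefficients, stored as lists with the highest degree first; `sylvester` and `det` are names chosen for this illustration, and the determinant is computed by plain Gaussian elimination over the rationals, which is just one convenient exact method.

```python
from fractions import Fraction


def sylvester(f, g):
    """Sylvester matrix with m shifted columns of f, then n shifted columns of g."""
    n, m = len(f) - 1, len(g) - 1
    size = n + m
    matrix = [[0] * size for _ in range(size)]
    for j in range(m):                      # columns of coefficients of f
        for i, c in enumerate(f):
            matrix[j + i][j] = c
    for j in range(n):                      # columns of coefficients of g
        for i, c in enumerate(g):
            matrix[j + i][m + j] = c
    return matrix


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    size, result = len(a), Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for k in range(col, size):
                a[r][k] -= factor * a[col][k]
    return result


# x^3 + 2x^2 - x - 2 and 2x^2 - 2x - 4 share the root -1: the resultant vanishes.
print(det(sylvester([1, 2, -1, -2], [2, -2, -4])))
# x^2 + 1 and x - 1 have no common root: the resultant is nonzero.
print(det(sylvester([1, 0, 1], [1, -1])))
```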

More details on resultants can be found in Biermann (1891), Gordan (1885) and 
Haskell (1892). Computations for both the univariate and multivariate case are 
discussed in Collins (1971). 

3.2 The Scalar Subresultant 

We are interested in finding out which degrees appear in the degree sequence of 
the intermediate results in the Euclidean Algorithm. Below we will see that the 
scalar subresultants provide a solution to this problem. 

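Definition 3.3. Let R be a commutative ring and $f = \sum_{0 \le j \le n} f_j x^j$, 
$g = \sum_{0 \le j \le m} g_j x^j \in R[x]$ polynomials of degree $n \ge m > 0$, respectively. The 
determinant $\sigma_k(f,g) \in R$ of the $(m + n - 2k) \times (m + n - 2k)$ matrix 

$$S_k(f,g) = \begin{pmatrix}
f_n         &        &        & g_m         &        &        \\
f_{n-1}     & f_n    &        & g_{m-1}     & g_m    &        \\
\vdots      &        & \ddots & \vdots      &        & \ddots \\
f_{n-m+k+1} & \cdots & f_n    & g_{k+1}     & \cdots & g_m    \\
\vdots      &        & \vdots & \vdots      &        & \vdots \\
f_{k+1}     & \cdots & f_m    & g_{m-n+k+1} & \cdots & g_m    \\
\vdots      &        & \vdots & \vdots      &        & \vdots \\
f_{2k-m+1}  & \cdots & f_k    & g_{2k-n+1}  & \cdots & g_k
\end{pmatrix}$$

with $m - k$ columns of coefficients of f followed by $n - k$ columns of 
coefficients of g is called the kth (scalar) subresultant of f and g. By convention an $f_j$ or 
$g_j$ with $j < 0$ is zero. If f and g are clear from the context, then we write $S_k$ 
and $\sigma_k$ for short instead of $S_k(f,g)$ and $\sigma_k(f,g)$. 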

Sylvester (1840) already contains an explicit description of the (scalar) 
subresultants. In Habicht (1948), p. 104, $\sigma_k$ is called Nebenresultante (minor 
resultant) for polynomials f and g of degrees n and n − 1. The definition is also in 
von zur Gathen (1984) and is used in von zur Gathen & Gerhard (1999), 
Section 6.10, p. 169. 

Remark 3.4. 

(i) $S_0 = \mathrm{Syl}(f, g)$ and therefore $\sigma_0 = \det(S_0)$ is the resultant. 

(ii) $\sigma_m = g_m^{n-m}$. 

(iii) $S_k$ is the matrix obtained from the Sylvester matrix by deleting the last 2k 
rows and the last k columns with coefficients of f, and the last k columns with 
coefficients of g. 

(iv) $S_k$ is a submatrix of $S_i$ if $k \ge i$. 
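Remark 3.4(iii) translates directly into code: build the Sylvester matrix, delete rows and columns, take the determinant. The sketch below is self-contained and uses the same assumed conventions as before (coefficient lists with the highest degree first, exact elimination over the rationals); the function names are hypothetical.

```python
from fractions import Fraction


def sylvester(f, g):
    """Sylvester matrix with m shifted columns of f, then n shifted columns of g."""
    n, m = len(f) - 1, len(g) - 1
    matrix = [[0] * (n + m) for _ in range(n + m)]
    for j in range(m):
        for i, c in enumerate(f):
            matrix[j + i][j] = c
    for j in range(n):
        for i, c in enumerate(g):
            matrix[j + i][m + j] = c
    return matrix


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    size, result = len(a), Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for k in range(col, size):
                a[r][k] -= factor * a[col][k]
    return result


def scalar_subresultant(f, g, k):
    """sigma_k: drop the last 2k rows, the last k f-columns and the last k g-columns."""
    n, m = len(f) - 1, len(g) - 1
    full = sylvester(f, g)
    columns = list(range(m - k)) + list(range(m, m + n - k))
    rows = range(n + m - 2 * k)
    return det([[full[r][c] for c in columns] for r in rows])


f = [1, 2, -1, -2]      # x^3 + 2x^2 - x - 2
g = [2, -2, -4]         # 2x^2 - 2x - 4
print([scalar_subresultant(f, g, k) for k in range(3)])
```

For this pair the Euclidean Algorithm produces a remainder of degree 1 (namely 4x + 4) but none of degree 0, which matches a nonzero value for k = 1 and a vanishing value for k = 0.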

3.3 The Polynomial Subresultant 

We now introduce two slightly different definitions of polynomial subresultants. 
The first one is from Collins (1967), p. 129, and the second one is from Brown 
& Traub (1971), p. 507 and also in Zippel (1993), Chapter 9.3, p. 150. They 
yield polynomials that are related to the intermediate results in the Euclidean 
Algorithm. 

Definition 3.5. Let R be a commutative ring, and $f = \sum_{0 \le j \le n} f_j x^j$, 
$g = \sum_{0 \le j \le m} g_j x^j \in R[x]$ polynomials of degree $n \ge m > 0$. Let $M_{ik} = M_{ik}(f,g)$ be 
the $(n + m - 2k) \times (n + m - 2k)$ submatrix of Syl(f, g) obtained by deleting the last 
k of the m columns of coefficients of f, the last k of the n columns of coefficients 
of g and the last $2k + 1$ rows except row $(n + m - i - k)$, for $0 \le k \le m$ and 
$0 \le i \le n$. The polynomial $R_k(f,g) = \sum_{0 \le i \le n} \det(M_{ik}) x^i \in R[x]$ is called the kth 
polynomial subresultant of f and g. In fact Collins (1967) considered the 
transposed matrices. If f and g are clear from the context, then we write $R_k$ for 
short instead of $R_k(f,g)$. Note that $\det(M_{ik}) = 0$ if $i > k$ since then the last row 
of $M_{ik}$ is identical to the $(n + m - i - k)$th row. Thus $R_k = \sum_{0 \le i \le k} \det(M_{ik}) x^i$. 

Remark 3.6. 

(i) $M_{00} = \mathrm{Syl}(f, g)$ and therefore $R_0 = \det(M_{00})$ is the resultant. 

(ii) Remark 3.4(i) implies $\sigma_0 = R_0$. 

Definition 3.7. Let R be a commutative ring and $f = \sum_{0 \le j \le n} f_j x^j$, 
$g = \sum_{0 \le j \le m} g_j x^j \in R[x]$ polynomials of degree $n \ge m > 0$. We consider the 
determinant $Z_k(f,g) = \det(M_k^*) \in R[x]$ of the $(n + m - 2k) \times (n + m - 2k)$ matrix $M_k^*$ 
obtained from $M_{ik}$ by replacing the last row with $(x^{m-k-1} f, \ldots, f, x^{n-k-1} g, \ldots, g)$. 

Table 1 gives an overview of the literature concerning these notions. There 
is a much larger body of work about the special case of the resultant, which we 
do not quote here. 

3.4 Comparison of the Various Definitions 

As in Brown & Traub (1971), p. 508, and Geddes et al. (1992), Section 7.3, 
p. 290, we first have the following theorem which shows that the definitions 
in Collins (1967) and Brown & Traub (1971) describe the same polynomial. 

Theorem 3.8. 

(i) If $\sigma_k(f,g) \ne 0$, then $\sigma_k(f,g)$ is the leading coefficient of $R_k(f,g)$. Otherwise, 
$\deg R_k(f,g) < k$. 

(ii) $R_k(f,g) = Z_k(f,g)$. 

Definition                                             Authors 
-----------------------------------------------------  ------------------------------------ 
$\sigma_k(f,g) = \det(S_k) \in R$                      Sylvester (1840), Habicht (1948), 
                                                       von zur Gathen (1984), 
                                                       von zur Gathen & Gerhard (1999) 

$R_k(f,g) = \sum_{0 \le i \le n} \det(M_{ik}) x^i$     Collins (1967), Loos (1982), 
                                                       Geddes et al. (1992) 

$= Z_k(f,g) = \det(M_k^*) \in R[x]$                    Brown & Traub (1971), 
                                                       Zippel (1993), Lickteig & Roy (1997), 
                                                       Reischert (1997) 

Table 1. Definitions of subresultants 

Lemma 3.9. Let F be a field, f and g in F[x] be polynomials of degree $n \ge m > 0$, 
respectively, and let $r_i$, $s_i$ and $t_i$ be the entries in the ith row of the 
Extended Euclidean Scheme, so that $r_i = s_i f + t_i g$ for $0 \le i \le \ell$. Moreover, let 
$\rho_i = \mathrm{lc}(r_i)$ and $n_i = \deg r_i$ for all i. Then 

$$\frac{\sigma_{n_i}}{\rho_i} \cdot r_i = R_{n_i} \quad \text{for } 2 \le i \le \ell.$$

Remark 3.10. Let f and g be polynomials over an integral domain R, let F be 
the field of fractions of R, and consider the Extended Euclidean Scheme of f 
and g in F[x]. Then the scalar and the polynomial subresultants are in R and 
R[x], respectively, and Lemma 3.9 also holds: 

$$\frac{\sigma_{n_i}}{\rho_i} \cdot r_i = R_{n_i} \in R[x].$$

Note that $r_i$ is not necessarily in R[x] and $\rho_i$ not necessarily in R. 

4 Division Rules and Polynomial Remainder Sequences 
(PRS) 

We cannot directly apply the Euclidean Algorithm to polynomials f and g over 
an integral domain R since polynomial division with remainder in R[x], which 
is used in every step of the Euclidean Algorithm, is not always defined. Hence 
our goal now is to modify the definitions in such a way that they yield a variant of 
the Euclidean Algorithm that works over an integral domain. We introduce a 
generalization of the usual pseudo-division, the concept of division rules, which 
leads to intermediate results in R[x]. 

Definition 4.1. Let R be an integral domain. A one-step division rule is a 
partial mapping 

$\mathcal{R}\colon R[x]^2 \rightharpoonup R^2$ 

such that for all $(f,g) \in \mathrm{def}(\mathcal{R})$ there exist $q, r \in R[x]$ satisfying 

(i) $\mathcal{R}(f,g) = (\alpha, \beta)$, 

(ii) $\alpha f = qg + \beta r$ and $\deg r < \deg g$. 

Recall that $\mathrm{def}(\mathcal{R}) \subseteq R[x]^2$ is the domain of definition of $\mathcal{R}$, that is, the set 
of $(f,g) \in R[x]^2$ at which $\mathcal{R}$ is defined. In particular, $\mathcal{R}\colon \mathrm{def}(\mathcal{R}) \to R^2$ is a 
total map. In the examples below, we will usually define one-step division rules 
by starting with a (total or partial) map $\mathcal{R}_0\colon R[x]^2 \rightharpoonup R^2$ and then taking $\mathcal{R}$ 
to be the maximal one-step division rule consistent with $\mathcal{R}_0$. Thus 

$\mathrm{def}(\mathcal{R}) = \{(f,g) \in R[x]^2 : \exists \alpha, \beta \in R, \exists q, r \in R[x]$ 
$(\alpha, \beta) = \mathcal{R}_0(f,g)$ and (ii) holds$\}$, 

and $\mathcal{R}$ is $\mathcal{R}_0$ restricted to $\mathrm{def}(\mathcal{R})$. Furthermore (f, 0) is never in $\mathrm{def}(\mathcal{R})$ ("you 
can't divide by zero"), so that 

$\mathrm{def}(\mathcal{R}) \subseteq \mathcal{D}_{\max} = R[x] \times (R[x] \setminus \{0\})$. 

We are particularly interested in one-step division rules $\mathcal{R}$ with $\mathrm{def}(\mathcal{R}) = \mathcal{D}_{\max}$. 
In our examples, (0, g) will always be in $\mathrm{def}(\mathcal{R})$ if $g \ne 0$. 

We may consider the usual remainder as a partial function rem: $R[x]^2 \rightharpoonup R[x]$ 
with rem(f, g) = r if there exist $q, r \in R[x]$ with $f = qg + r$ and $\deg r < \deg g$, 
and def(rem) maximal. Recall from Section 2 the definitions of rem, prem 
and cont. 

Example 4.2. Let f and g be polynomials over an integral domain R of degrees 
n and m, respectively, and let $f_n = \mathrm{lc}(f)$, $g_m = \mathrm{lc}(g) \ne 0$ be their leading 
coefficients. Then the three most famous types of division rules are as follows: 

— classical division rule: $\mathcal{R}(f,g) = (1, 1)$. 

— monic division rule: $\mathcal{R}(f,g) = (1, \mathrm{lc}(\mathrm{rem}(f, g)))$. 

— Sturmian division rule: $\mathcal{R}(f,g) = (1, -1)$. 

Examples are given below. When R is a field, these three division rules have 
the largest possible domain of definition $\mathrm{def}(\mathcal{R}) = \mathcal{D}_{\max}$, but otherwise, it may 
be smaller; we will illustrate this in Example 4.7. Hence they do not help us in 
achieving our goal of finding rules with maximal domain $\mathcal{D}_{\max}$. But there exist 
two division rules which, in contrast to the first examples, always yield solutions 
in R[x]: 

— pseudo-division rule: $\mathcal{R}(f,g) = (g_m^{n-m+1}, 1)$. 

In case R is a unique factorization domain, we have the 

— primitive division rule: $\mathcal{R}(f,g) = (g_m^{n-m+1}, \mathrm{cont}(\mathrm{prem}(f,g)))$. 

For algorithmic purposes, it is then useful for R to be a Euclidean domain. 

The disadvantage of the pseudo-division rule, however, is that in the 
Euclidean Algorithm it leads to exponential coefficient growth; the coefficients of 
the intermediate results are usually enormous, and their bit length may be 
exponential in the bit length of the input polynomials f and g. If R is a UFD, we get 
the smallest intermediate results if we use the primitive division rule, but the 
computation of the content in every step of the Euclidean Algorithm seems to 
be expensive. Collins (1967) already observed this in his experiments. Thus he 
tries to avoid the computation of contents and to keep the intermediate results 
"small" at the same time by using information from all intermediate results in 
the EEA, not only the two previous remainders. Our concept of one-step division 
rules does not cover his method. So we now extend our previous definition, and 
will actually capture all the "recursive" division rules from Collins (1967, 1971, 
1973), Brown & Traub (1971) and Brown (1971) under one umbrella. 

Definition 4.3. Let R be an integral domain. A division rule is a partial 
mapping $\mathcal{R}$ 
associating to $(f,g) \in \mathrm{def}(\mathcal{R})$ a sequence $((\alpha_2, \beta_2), \ldots, (\alpha_{\ell+1}, \beta_{\ell+1}))$ of arbitrary 
length $\ell$ such that for all $(f,g) \in \mathrm{def}(\mathcal{R})$ there exist $\ell \in \mathbb{N}_{\ge 0}$, $q_1, \ldots, q_\ell \in R[x]$ 
and $r_0, \ldots, r_{\ell+1} \in R[x]$ satisfying 

(i) $r_0 = f$, $r_1 = g$, 

(ii) $\mathcal{R}_i(f,g) = \mathcal{R}(f,g)_i = (\alpha_i, \beta_i)$, 

(iii) $\alpha_i r_{i-2} = q_{i-1} r_{i-1} + \beta_i r_i$ and $\deg r_i < \deg r_{i-1}$ 

for $2 \le i \le \ell + 1$. The integer $\ell = |\mathcal{R}(f,g)|$ is the length of the sequence. 

A division rule where $\ell = 1$ for all values is the same as a one-step division 
rule, and from an arbitrary division rule we can obtain a one-step division rule 
by projecting to the first coordinate $(\alpha_2, \beta_2)$ if $\ell \ge 2$. Using Lemma 2.6, we find 
that for all $(f, g) \in \mathrm{def}(\mathcal{R})$, $q_{i-1}$ and $r_i$ are unique for $2 \le i \le \ell + 1$. If we have 
a one-step division rule $\mathcal{R}^*$ which is defined at all $(r_{i-2}, r_{i-1})$ for $2 \le i \le \ell + 1$ 
(defined recursively), then we obtain a division rule $\mathcal{R}$ by using $\mathcal{R}^*$ in every step: 

$\mathcal{R}_i(f,g) = \mathcal{R}^*(r_{i-2}, r_{i-1}) = (\alpha_i, \beta_i)$. 

If we truncate $\mathcal{R}$ at the first coordinate, we get $\mathcal{R}^*$ back. But the notion of 
division rules is strictly richer than that of one-step division rules; for example 
the first step in the reduced division rule below is just the pseudo-division rule, 
but using the pseudo-division rule repeatedly does not yield the reduced division 
rule. 

Example 4.2 continued. Let $f = r_0, g = r_1, r_2, \ldots, r_\ell \in R[x]$ be as in 
Definition 4.3, let $n_i = \deg r_i$ be their degrees, $\rho_i = \mathrm{lc}(r_i)$ their leading coefficients, 
and $d_i = n_i - n_{i+1} \in \mathbb{N}_{\ge 0}$ for $0 \le i < \ell$ (if $n_0 \ge n_1$). We now present two 
different types of recursive division rules. They are based on polynomial 
subresultants. It is not obvious that they have domain of definition $\mathcal{D}_{\max}$, since 
divisions occur in their definitions. We will show that this is indeed the case in 
Remarks 6.8 and 6.12. 

— reduced division rule: $\mathcal{R}_i(f,g) = (\alpha_i, \beta_i)$ for $2 \le i \le \ell + 1$, 
where we set $\alpha_1 = 1$ and 

$(\alpha_i, \beta_i) = (\rho_{i-1}^{d_{i-2}+1}, \alpha_{i-1})$ for $2 \le i \le \ell + 1$. 

— subresultant division rule: $\mathcal{R}_i(f,g) = (\alpha_i, \beta_i)$ for $2 \le i \le \ell + 1$, 
where we set $\rho_0 = 1$, $\psi_2 = -1$, and $\psi_3, \ldots, \psi_{\ell+1}$ with 

$(\alpha_i, \beta_i) = (\rho_{i-1}^{d_{i-2}+1}, -\rho_{i-2} \psi_i^{d_{i-2}})$ for $2 \le i \le \ell + 1$, 
$\psi_i = (-\rho_{i-2})^{d_{i-3}} \psi_{i-1}^{1-d_{i-3}}$ for $3 \le i \le \ell + 1$. 

The subresultant division rule was invented by Collins (1967), p. 130. He tried 
to find a rule such that the $r_i$'s agree with the polynomial subresultants up to a 
small constant factor. Brown (1971), p. 486, then provided a recursive definition 
of the $\alpha_i$ and $\beta_i$ as given above. Brown (1971) also describes an "improved 
division rule", where one has some magical divisor of $\rho_i$. 

We note that the exponents in the recursive definition of the $\psi_i$'s in the 
subresultant division rule may be negative. Hence it is not clear that the $\beta_i$'s are 
in R. However, we will show this in Theorem 6.15, and so answer the following 
open question that was posed in Brown (1971), p. 486: 

Question 4.4. "At the present time it is not known whether or not these 
equations imply $\psi_i, \beta_i \in R$." 

By definition, a division rule $\mathcal{R}$ defines a sequence $(r_0, \ldots, r_\ell)$ of 
remainders; recall that they are uniquely defined. Since it is more convenient to work 
with these "polynomial remainder sequences", we fix this notion in the following 
definition. 

Definition 4.5. Let $\mathcal{R}$ be a division rule. A sequence $(r_0, \ldots, r_\ell)$ with each $r_i \in$ 
$R[x] \setminus \{0\}$ is called a polynomial remainder sequence (PRS) for (f, g) according 
to $\mathcal{R}$ if 

(i) $r_0 = f$, $r_1 = g$, 

(ii) $\mathcal{R}_i(f,g) = (\alpha_i, \beta_i)$, 

(iii) $\alpha_i r_{i-2} = q_{i-1} r_{i-1} + \beta_i r_i$, 

for $2 \le i \le \ell + 1$, where $\ell$ is the length of $\mathcal{R}(f,g)$. The PRS is complete if 
$r_{\ell+1} = 0$. It is called normal if $d_i = \deg r_i - \deg r_{i+1} = 1$ for $1 \le i \le \ell - 1$ 
(Collins (1967), p. 128/129). 

In fact the remainders for PRS according to arbitrary division rules over an 
integral domain only differ by a nonzero constant factor. 

Proposition 4.6. Let R be an integral domain, $f, g \in R[x]$ and $r = (r_0, \ldots, r_\ell)$ 
and $r^* = (r_0^*, \ldots, r_{\ell^*}^*)$ be two PRS for (f, g) according to two division rules $\mathcal{R}$ 
and $\mathcal{R}^*$, respectively, none of whose results $\alpha_i, \beta_i, \alpha_i^*, \beta_i^*$ is zero. Then $r_i^* = \gamma_i r_i$ 
with 

$$\gamma_i = \prod_{0 \le k \le i/2 - 1} \frac{\alpha_{i-2k}^* \, \beta_{i-2k}}{\alpha_{i-2k} \, \beta_{i-2k}^*} \in F \setminus \{0\}$$

for $0 \le i \le \min\{\ell, \ell^*\}$, where F is the field of fractions of R. 

The proposition yields a direct way to compute the PRS for (f, g) according 
to $\mathcal{R}^*$ from the PRS for (f, g) according to $\mathcal{R}$ and the $\alpha_i, \beta_i, \alpha_i^*, \beta_i^*$. In particular, 
the degrees of the remainders in any two PRS are identical. 

In Example 4.2 we have seen seven different division rules. Now we consider 
the different polynomial remainder sequences according to these rules. Each PRS 
will be illustrated by the following example. 



Example 4.7. We perform the computations on the polynomials 

$f = r_0 = 9x^6 - 27x^4 - 27x^3 + 72x^2 + 18x - 45$ and 
$g = r_1 = 3x^4 - 4x^2 - 9x + 21$ 

over R = Q and, wherever possible, also over R = Z. 



For all seven PRS: 
i = 0: 9x^6 - 27x^4 - 27x^3 + 72x^2 + 18x - 45 
i = 1: 3x^4 - 4x^2 - 9x + 21 

PRS           i = 2                    i = 3                            i = 4 
------------  -----------------------  -------------------------------  ----------------------- 
classical     -11x^2 - 27x + 60        -(164880/1331)x + 248931/1331    -1959126851/335622400 
monic         x^2 + (27/11)x - 60/11   x - 27659/18320                  1 
Sturmian      11x^2 + 27x - 60         (164880/1331)x - 248931/1331     -1959126851/335622400 
pseudo        -297x^2 - 729x + 1620    3245333040x - 4899708873         -1659945865306233453993 
primitive     -11x^2 - 27x + 60        18320x - 27659                   -1 
reduced       -297x^2 - 729x + 1620    120197520x - 181470699           86915463129 
subresultant  297x^2 + 729x - 1620     13355280x - 20163411             9657273681 
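The four integer sequences of this example (pseudo, primitive, reduced, subresultant) can be recomputed with a short script. This is a sketch under stated assumptions, not the implementation reported in Section 8: polynomials are integer coefficient lists with the highest degree first, deg f >= deg g is assumed, and the names `prem` and `prs` and the rule strings are invented here. The multipliers follow the division rules of Example 4.2; all four rules share the same alpha, so only beta differs.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def deg(p):
    """Degree of a nonzero polynomial (coefficient list, highest degree first)."""
    return len(p) - 1


def prem(f, g):
    """Pseudo-remainder r with lc(g)^(deg f - deg g + 1) * f = q * g + r."""
    r = list(f)
    for _ in range(deg(f) - deg(g) + 1):
        top = r[0]
        r = [g[0] * c for c in r]            # multiply by lc(g)
        for j, c in enumerate(g):            # cancel the formal leading term
            r[j] -= top * c
        r.pop(0)
    while r and r[0] == 0:                   # strip leading zeros
        r.pop(0)
    return r


def prs(f, g, rule):
    """PRS for (f, g) under 'pseudo', 'primitive', 'reduced' or 'subresultant'."""
    rs = [list(f), list(g)]
    alpha_prev = 1                           # alpha_1 = 1 (reduced rule)
    psi = Fraction(-1)                       # psi_2 = -1 (subresultant rule)
    while True:
        a, b = rs[-2], rs[-1]                # r_{i-2} and r_{i-1}
        i = len(rs)                          # index of the remainder to compute
        p = prem(a, b)                       # equals beta_i * r_i
        if not p:
            return rs
        d = deg(a) - deg(b)                  # d_{i-2}
        if rule == "pseudo":
            beta = 1
        elif rule == "primitive":
            beta = reduce(gcd, p, 0)         # content, taken non-negative
        elif rule == "reduced":
            beta = alpha_prev
        elif rule == "subresultant":
            rho = 1 if i == 2 else a[0]      # rho_0 is set to 1 by convention
            if i > 2:
                d_prev = deg(rs[-3]) - deg(a)            # d_{i-3}
                psi = Fraction(-rho) ** d_prev / psi ** (d_prev - 1)
            beta = -rho * psi ** d
        else:
            raise ValueError(f"unknown rule: {rule}")
        alpha_prev = b[0] ** (d + 1)         # alpha_i = rho_{i-1}^(d_{i-2}+1)
        quotients = [Fraction(c) / beta for c in p]
        assert all(c.denominator == 1 for c in quotients)   # division is exact
        rs.append([int(c) for c in quotients])


f = [9, 0, -27, -27, 72, 18, -45]
g = [3, 0, -4, -9, 21]
for rule in ("pseudo", "primitive", "reduced", "subresultant"):
    print(rule, prs(f, g, rule)[2:])
```

The exactness assertion inside `prs` is a safeguard for this sketch: it fails loudly if a rule is applied to inputs where the division by beta does not stay in Z.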



1. Classical PRS. The most familiar PRS for (f,g) is obtained according 
to the classical division rule. Collins (1973), p. 736, calls this the natural 
Euclidean PRS (algorithm). The intermediate results of the classical PRS 
and of the Euclidean Algorithm coincide. 

2. Monic PRS. In Collins (1973), p. 736, the PRS for $(f,g)$ according to the
monic division rule is called monic PRS (algorithm). The $r_i$ are monic for
$2 \le i \le \ell$, and we get the same intermediate results as in the monic Euclidean
Algorithm in von zur Gathen & Gerhard (1999), Section 3.2, p. 47.

3. Sturmian PRS. We choose the PRS for $(f, g)$ according to the Sturmian
division rule as introduced in Sturm (1835). Kronecker (1873), p. 117,
Habicht (1948), p. 102, and Loos (1982), p. 119, deal with this generalized
Sturmian PRS (algorithm). Kronecker (1873) calls it Sturmsche Reihe (Sturmian
sequence), and in Habicht (1948) it is the verallgemeinerte Sturmsche
Kette (generalized Sturmian chain). If $g = \partial f/\partial x$ as in Habicht (1948), p. 99,
then this is the classical Sturmian PRS (algorithm). Note that the Sturmian
PRS agrees with the classical PRS up to sign.

If $R$ is not a field, then Example 4.7 shows that the first three types of PRS
may not have $D_{\max}$ as their domain of definition. In the example they are only
of length 1. But fortunately there are division rules that have this property.

4. Pseudo PRS. If we choose the PRS according to the pseudo-division rule, 
then we get the so-called pseudo PRS. Collins (1967), p. 138, calls this the 
Euclidean PRS (algorithm) because it is the most obvious generalization of 
the Euclidean Algorithm to polynomials over an integral domain R that is 
not a field. Collins (1973), p. 737, also calls it pseudo-remainder PRS. 

5. Primitive PRS. To obtain a PRS over R with minimal coefficient growth, 
we choose the PRS according to the primitive division rule which yields 
primitive intermediate results. Brown (1971), p. 484, calls this the primitive 
PRS (algorithm). 

6. Reduced PRS. A perceived drawback of the primitive PRS is the (seem- 
ingly) costly computation of the content; recently the algorithm of Cooper- 
man et al. (1999) achieves this with an expected number of less than two 
integer gcd’s. In fact, in our experiments in Section 8, the primitive PRS 
turns out to be most efficient among those discussed here. But Collins (1967) 
introduced his reduced PRS (algorithm) in order to avoid the computation of 
contents completely. His algorithm uses the reduced division rule and keeps 
the intermediate coefficients reasonably small but not necessarily as small as 
with the primitive PRS. 

7. Subresultant PRS. The reduced PRS is not the only way to keep the coefficients
small without computing contents. We can also use the subresultant
division rule. According to Collins (1967), p. 130, this is the subresultant
PRS (algorithm).
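The first three rules in the list above can be tried out on Example 4.7 with a few lines of Python. This is only a sketch under stated assumptions, not the implementation used for the experiments in Section 8: polynomials are coefficient lists with the highest degree first, arithmetic is done over $\mathbb{Q}$ with `fractions.Fraction`, and the helper names `poly_rem` and `prs` are ours. We assume the usual multipliers $(\alpha_i, \beta_i) = (1, 1)$ for the classical rule, $(1, \operatorname{lc}(\operatorname{rem}))$ for the monic rule and $(1, -1)$ for the Sturmian rule, with $\alpha_i r_{i-2} = q_{i-1} r_{i-1} + \beta_i r_i$.

```python
from fractions import Fraction

def poly_rem(a, b):
    """Remainder of a modulo b over Q; coefficient lists, highest degree first."""
    a = [Fraction(c) for c in a]
    b = [Fraction(c) for c in b]
    while len(a) >= len(b):
        q = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= q * b[i]          # the leading coefficient becomes 0
        a.pop(0)
    while a and a[0] == 0:            # drop leading zeros of the remainder
        a.pop(0)
    return a

def prs(f, g, rule):
    """PRS with alpha_i * r_{i-2} = q * r_{i-1} + beta_i * r_i."""
    seq = [[Fraction(c) for c in f], [Fraction(c) for c in g]]
    while True:
        rem = poly_rem(seq[-2], seq[-1])
        if not rem:                   # remainder 0: the sequence is complete
            return seq
        alpha, beta = rule(rem)
        seq.append([alpha * c / beta for c in rem])

def classical(rem):
    return 1, 1

def monic(rem):
    return 1, rem[0]                  # divide by the leading coefficient

def sturmian(rem):
    return 1, -1                      # negate every remainder

f = [9, 0, -27, -27, 72, 18, -45]     # 9x^6 - 27x^4 - 27x^3 + 72x^2 + 18x - 45
g = [3, 0, -4, -9, 21]                # 3x^4 - 4x^2 - 9x + 21

classical_seq = prs(f, g, classical)
monic_seq = prs(f, g, monic)
sturmian_seq = prs(f, g, sturmian)
```

All three sequences have the degrees 6, 4, 2, 1, 0, in line with the remark after Proposition 4.6 that the degrees of the remainders in any two PRS are identical.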

5 Fundamental Theorem on Subresultants

Collins' Fundamental Theorem on subresultants expresses an arbitrary subresultant
as a power product of certain data in the PRS, namely the multipliers $\alpha$ and
$\beta$ and the leading coefficients of the remainders in the Euclidean Algorithm. In
this section our first goal is to prove the Fundamental Theorem on subresultants
for polynomial remainder sequences according to an arbitrary division rule $\mathcal{R}$.

The following result is shown for PRS in Brown & Traub (1971), p. 511, 
Fundamental theorem, and for reduced PRS in Collins (1967), p. 132, Lemma 2, 
and p. 133, Theorem 1. 
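Fundamental Theorem 5.1. Let $f$ and $g \in R[x]$ be polynomials of degrees
$n \ge m > 0$, respectively, over an integral domain $R$, let $\mathcal{R}$ be a division rule and
$(r_0, \ldots, r_\ell)$ be the PRS for $(f,g)$ according to $\mathcal{R}$, $(\alpha_i, \beta_i) = \mathcal{R}_i(f,g)$ the constant
multipliers, $n_i = \deg r_i$ and $\rho_i = \operatorname{lc}(r_i)$ for $0 \le i \le \ell$, and $d_i = n_i - n_{i+1}$ for
$0 \le i \le \ell - 1$.

(i) For $0 \le j \le n_1$, the $j$th subresultant of $(f,g)$ is

$$\sigma_j(f,g) = (-1)^{b_i} \rho_i^{\,n_{i-1} - n_i} \prod_{2 \le k \le i} \left(\frac{\beta_k}{\alpha_k}\right)^{n_{k-1} - n_i} \rho_{k-1}^{\,n_{k-2} - n_k}$$

if $j = n_i$ for some $1 \le i \le \ell$, otherwise $0$, where
$b_i = \sum_{2 \le k \le i} (n_{k-2} - n_i)(n_{k-1} - n_i)$.

(ii) The subresultants satisfy for $1 \le i < \ell$ the recursive formulas
$\sigma_{n_1}(f,g) = \rho_1^{d_0}$ and

$$\sigma_{n_{i+1}}(f,g) = \sigma_{n_i}(f,g) \cdot (-1)^{d_i(n_0 - n_{i+1} + i + 1)} (\rho_{i+1}\rho_i)^{d_i} \prod_{2 \le k \le i+1} \left(\frac{\beta_k}{\alpha_k}\right)^{d_i}.$$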

Corollary 5.2. Let $\mathcal{R}$ be a division rule and $(r_0, \ldots, r_\ell)$ be the PRS for $(f,g)$
according to $\mathcal{R}$, let $n_i = \deg r_i$ for $0 \le i \le \ell$ be the degrees in the PRS, and let
$0 \le k \le n_1$. Then

$\sigma_k(f,g) \ne 0 \iff \exists i\colon k = n_i$.
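As a sanity check, the formula of Theorem 5.1(i) can be evaluated on the classical PRS of Example 4.7. The sketch below is ours and rests on two assumptions: that the classical rule has $\alpha_k = \beta_k = 1$, and that the theorem reads $\sigma_{n_i} = (-1)^{b_i} \rho_i^{n_{i-1}-n_i} \prod_{2 \le k \le i} (\beta_k/\alpha_k)^{n_{k-1}-n_i} \rho_{k-1}^{n_{k-2}-n_k}$ with $b_i = \sum_{2 \le k \le i} (n_{k-2}-n_i)(n_{k-1}-n_i)$. The function names are hypothetical.

```python
from fractions import Fraction

def poly_rem(a, b):
    """Remainder of a modulo b over Q; coefficient lists, highest degree first."""
    a = [Fraction(c) for c in a]
    b = [Fraction(c) for c in b]
    while len(a) >= len(b):
        q = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= q * b[i]
        a.pop(0)
    while a and a[0] == 0:
        a.pop(0)
    return a

def classical_prs(f, g):
    """Classical PRS: r_i = rem(r_{i-2}, r_{i-1}) until the remainder vanishes."""
    seq = [[Fraction(c) for c in f], [Fraction(c) for c in g]]
    while True:
        rem = poly_rem(seq[-2], seq[-1])
        if not rem:
            return seq
        seq.append(rem)

def subresultants_from_classical_prs(seq):
    """Map n_i -> sigma_{n_i} for i = 1..l, using alpha_k = beta_k = 1."""
    n = [len(r) - 1 for r in seq]            # degrees n_i
    rho = [r[0] for r in seq]                # leading coefficients rho_i
    sigma = {}
    for i in range(1, len(seq)):
        b = sum((n[k - 2] - n[i]) * (n[k - 1] - n[i]) for k in range(2, i + 1))
        value = (-1) ** b * rho[i] ** (n[i - 1] - n[i])
        for k in range(2, i + 1):
            value *= rho[k - 1] ** (n[k - 2] - n[k])
        sigma[n[i]] = value
    return sigma

f = [9, 0, -27, -27, 72, 18, -45]
g = [3, 0, -4, -9, 21]
sigma = subresultants_from_classical_prs(classical_prs(f, g))
```

The degrees in this PRS are 6, 4, 2, 1, 0, so by Corollary 5.2 the subresultant $\sigma_3$ vanishes and no value is produced for it. The values obtained for $\sigma_1$ and $\sigma_0$ coincide with the leading coefficients of $r_3$ and $r_4$ in the subresultant column of the table in Example 4.7.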

6 Applications of the Fundamental Theorem 

Following our program, we now derive results for the various PRS for polynomials
$f, g \in R[x]$ of degrees $n \ge m > 0$, respectively, over an integral domain $R$,
according to the division rules in Example 4.2.

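Corollary 6.1. Let $(r_0, \ldots, r_\ell)$ be a classical PRS and $1 \le i \le \ell$. Then

(i) $\sigma_{n_i}(f,g) = (-1)^{b_i} \rho_i^{d_{i-1}} \prod_{2 \le k \le i} \rho_{k-1}^{\,n_{k-2} - n_k}$.

(ii) The subresultants satisfy the recursive formulas
$\sigma_{n_1}(f,g) = \rho_1^{d_0}$, and
$\sigma_{n_{i+1}}(f,g) = \sigma_{n_i}(f,g) \cdot (-1)^{d_i(n_0 - n_{i+1} + i + 1)} (\rho_{i+1}\rho_i)^{d_i}$.

If the PRS is normal, then this simplifies to:

(iii) $\sigma_{n_i}(f,g) = (-1)^{(d_0+1)(i+1)} \rho_1^{d_0+1} \rho_i \prod_{3 \le k \le i} \rho_{k-1}^2$ for $i \ge 2$.

(iv) The subresultants satisfy the recursive formulas
$\sigma_{n_1}(f,g) = \rho_1^{d_0}$, and
$\sigma_{n_{i+1}}(f,g) = \sigma_{n_i}(f,g) \cdot (-1)^{d_0+1} \rho_{i+1} \rho_i$.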
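The following is the Fundamental Theorem 11.13 in von zur Gathen & Gerhard
(1999), Chapter 11.2, p. 307.

Corollary 6.2. Let $(r_0, \ldots, r_\ell)$ be a monic PRS, and $1 \le i \le \ell$. Then

(i) $\sigma_{n_i}(f,g) = (-1)^{b_i} \rho_0^{\,n_1 - n_i} \rho_1^{\,n_0 - n_i} \prod_{2 \le k \le i} \beta_k^{\,n_{k-1} - n_i}$.

(ii) The subresultants satisfy the recursive formulas
$\sigma_{n_1}(f,g) = \rho_1^{d_0}$, and
$\sigma_{n_{i+1}}(f,g) = \sigma_{n_i}(f,g) \cdot (-1)^{d_i(n_0 - n_{i+1} + i + 1)} (\rho_0 \rho_1 \beta_2 \cdots \beta_{i+1})^{d_i}$.

If the PRS is normal, then this simplifies to:

(iii) $\sigma_{n_i}(f,g) = (-1)^{(d_0+1)(i+1)} \rho_0^{\,i-1} \rho_1^{\,d_0+i-1} \prod_{2 \le k \le i} \beta_k^{\,i-k+1}$ for $i \ge 2$.

(iv) The subresultants satisfy for $1 \le i < \ell$ the recursive formulas
$\sigma_{n_1}(f,g) = \rho_1^{d_0}$, and
$\sigma_{n_{i+1}}(f,g) = \sigma_{n_i}(f,g) \cdot (-1)^{d_0+1} \rho_0 \rho_1 \beta_2 \cdots \beta_{i+1}$.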



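Corollary 6.3. Let $(r_0, \ldots, r_\ell)$ be a Sturmian PRS, and $1 \le i \le \ell$. Then

(i) $\sigma_{n_i}(f,g) = (-1)^{b_i + \sum_{2 \le k \le i} (n_{k-1} - n_i)} \rho_i^{d_{i-1}} \prod_{2 \le k \le i} \rho_{k-1}^{\,n_{k-2} - n_k}$.

(ii) The subresultants satisfy the recursive formulas
$\sigma_{n_1}(f,g) = \rho_1^{d_0}$, and
$\sigma_{n_{i+1}}(f,g) = \sigma_{n_i}(f,g) \cdot (-1)^{d_i(n_0 - n_{i+1} + 1)} (\rho_{i+1}\rho_i)^{d_i}$.

If the PRS is normal, then this simplifies to:

(iii) $\sigma_{n_i}(f,g) = (-1)^{(d_0+1)(i+1) + i(i-1)/2} \rho_1^{d_0+1} \rho_i \prod_{3 \le k \le i} \rho_{k-1}^2$ for $i \ge 2$.

(iv) The subresultants satisfy the recursive formulas
$\sigma_{n_1}(f,g) = \rho_1^{d_0}$, and
$\sigma_{n_{i+1}}(f,g) = \sigma_{n_i}(f,g) \cdot (-1)^{d_0+i+1} \rho_{i+1} \rho_i$.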

The following corollary can be found in Collins (1966), p. 710, Theorem 1, 
for polynomial subresultants. 

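Corollary 6.4. Let $(r_0, \ldots, r_\ell)$ be a pseudo PRS, and $1 \le i \le \ell$. Then

(i) $\sigma_{n_i}(f,g) = (-1)^{b_i} \rho_i^{d_{i-1}} \prod_{2 \le k \le i} \rho_{k-1}^{\,n_{k-2} - n_k - (n_{k-1} - n_i)(d_{k-2} + 1)}$.

(ii) The subresultants satisfy the recursive formulas
$\sigma_{n_1}(f,g) = \rho_1^{d_0}$, and
$\sigma_{n_{i+1}}(f,g) = \sigma_{n_i}(f,g) \cdot (-1)^{d_i(n_0 - n_{i+1} + i + 1)} (\rho_{i+1}\rho_i)^{d_i} \prod_{2 \le k \le i+1} \rho_{k-1}^{-d_i(d_{k-2}+1)}$.

If the PRS is normal, then this simplifies to:

(iii) $\sigma_{n_i}(f,g) = (-1)^{(d_0+1)(i+1)} \rho_1^{(d_0+1)(2-i)} \rho_i \prod_{3 \le k \le i-1} \rho_{k-1}^{-2(i-k)}$ for $i \ge 2$.

(iv) The subresultants satisfy the recursive formulas
$\sigma_{n_1}(f,g) = \rho_1^{d_0}$, and
$\sigma_{n_{i+1}}(f,g) = \sigma_{n_i}(f,g) \cdot (-1)^{d_0+1} \rho_1^{-(d_0+1)} \rho_{i+1} \rho_i \prod_{3 \le k \le i+1} \rho_{k-1}^{-2}$.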



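Remark 6.5. If the PRS is normal, then Corollary 6.4(iii) implies that

$\sigma_{n_i}(f,g) \cdot \rho_1^{(d_0+1)(i-2)} \prod_{3 \le k \le i-1} \rho_{k-1}^{2(i-k)} = (-1)^{(d_0+1)(i+1)} \rho_i$.

Thus $\sigma_{n_i}(f,g)$ divides $\rho_i$. This result is also shown for polynomial subresultants
in Collins (1966), p. 711, Corollary 1.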

Since the content of two polynomials cannot be expressed in terms of our parameters
$\rho_i$ and $n_i$, we do not consider the Fundamental Theorem for primitive
PRS.

The following is shown for polynomial subresultants in Collins (1967), p. 135, 
Corollaries 1.2 and 1.4. 

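Corollary 6.6. Let $(r_0, \ldots, r_\ell)$ be a reduced PRS, and $1 \le i \le \ell$. Then

(i) $\sigma_{n_i}(f,g) = (-1)^{b_i} \rho_i^{d_{i-1}} \prod_{2 \le k \le i} \rho_{k-1}^{\,d_{k-2}(1 - d_{k-1})}$.

(ii) The subresultants satisfy the recursive formulas
$\sigma_{n_1}(f,g) = \rho_1^{d_0}$, and
$\sigma_{n_{i+1}}(f,g) = \sigma_{n_i}(f,g) \cdot (-1)^{d_i(n_0 - n_{i+1} + i + 1)} \rho_{i+1}^{d_i} \rho_i^{-d_{i-1} d_i}$.

If the PRS is normal, then this simplifies to:

(iii) $\sigma_{n_i}(f,g) = (-1)^{(d_0+1)(i+1)} \rho_i$ for $i \ge 2$.

(iv) The subresultants satisfy the recursive formulas
$\sigma_{n_1}(f,g) = \rho_1^{d_0}$, and
$\sigma_{n_{i+1}}(f,g) = \sigma_{n_i}(f,g) \cdot (-1)^{d_0+1} \rho_{i+1} \rho_i^{-d_{i-1}}$.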
Remark 6.7. We obtain from Corollary 6.6(i)

$\sigma_{n_i}(f,g) \prod_{2 \le k \le i} \rho_{k-1}^{\,d_{k-2}(d_{k-1} - 1)} = (-1)^{b_i} \rho_i^{d_{i-1}}$.

Thus $\sigma_{n_i}(f,g)$ divides $\rho_i^{d_{i-1}}$. This result can also be found in Collins (1967),
p. 135, Corollary 1.2.



Remark 6.8. For every reduced PRS, $r_i$ is in $R[x]$ for $2 \le i \le \ell$. Note that
Corollary 6.6(iii) implies $r_i = (-1)^{(d_0+1)(i+1)} R_i(f,g)$. So the normal case is
clear. An easy proof for the general case based on polynomial subresultants
is in Collins (1967), p. 134, Corollary 1.1, and Brown (1971), p. 485/486.

Lemma 6.9. Let $e_{i,j} = d_{j-1} \prod_{j \le k < i} (1 - d_k)$, and let $\psi_i$ be as in the subresultant
division rule. Then

$\psi_i = - \prod_{1 \le j \le i-2} \rho_j^{\,e_{i-2,j}}$ for $2 \le i \le \ell$.



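Corollary 6.10. Let $(r_0, \ldots, r_\ell)$ be a subresultant PRS, and $1 \le i \le \ell$. Then

(i) $\sigma_{n_i}(f,g) = \prod_{1 \le k \le i} \rho_k^{\,e_{i,k}}$.

(ii) The subresultants satisfy the recursive formulas
$\sigma_{n_1}(f,g) = \rho_1^{d_0}$, and
$\sigma_{n_{i+1}}(f,g) = \sigma_{n_i}(f,g) \cdot \rho_{i+1}^{d_i} \prod_{1 \le k \le i} \rho_k^{-d_i e_{i,k}}$.

If the PRS is normal, then this simplifies to:

(iii) $\sigma_{n_i}(f,g) = \rho_i$ for $i \ge 2$.

(iv) The subresultants satisfy the recursive formulas
$\sigma_{n_1}(f,g) = \rho_1^{d_0}$, and
$\sigma_{n_{i+1}}(f,g) = \sigma_{n_i}(f,g) \cdot \rho_{i+1} \rho_i^{-d_{i-1}}$.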

Now we have all tools to prove the relation between normal reduced and 
normal subresultant PRS which can be found in Collins (1967), p. 135, Corol- 
lary 1.3, and Collins (1973), p. 738. 

Corollary 6.11. Let $(r_0, \ldots, r_\ell)$ be a normal reduced PRS and $(a_0, \ldots, a_\ell)$ a
normal subresultant PRS for the polynomials $r_0 = a_0 = f$ and $r_1 = a_1 = g$.
Then the following holds for $2 \le i \le \ell$:

$\operatorname{lc}(r_i) = (-1)^{(n_0 - n_i)(n_1 - n_i)} \cdot \operatorname{lc}(a_i)$.
Remark 6.12. For every subresultant PRS the polynomials $r_i$ are in $R[x]$ for
$2 \le i \le \ell$. Note that Corollary 6.10(iii) implies $r_i = R_i(f,g)$. So the normal case
is clear. An easy proof for the general case based on polynomial subresultants is
in Collins (1967), p. 130, and Brown (1971), p. 486.

Corollary 6.10 does not provide the only recursive formula for subresul- 
tants. Another one is based on an idea in Lickteig & Roy (1997), p. 12, and 
Reischert (1997), p. 238, where the following formula has been proven for poly- 
nomial subresultants. The translation of this result into a theorem on scalar 
subresultants leads us to an answer to Question 4.4. 

Theorem 6.13. Let $(r_0, \ldots, r_\ell)$ be a subresultant PRS. Then the subresultants
satisfy for $1 \le i < \ell$ the recursive formulas

$\sigma_{n_1}(f,g) = \rho_1^{d_0}$ and
$\sigma_{n_{i+1}}(f,g) = \sigma_{n_i}(f,g)^{1 - d_i} \cdot \rho_{i+1}^{d_i}$.

The proof of the conjecture now becomes pretty simple: 

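Corollary 6.14. Let $\psi_2 = -1$ and $\psi_i = (-\rho_{i-2})^{d_{i-3}} \psi_{i-1}^{\,1 - d_{i-3}}$ for $3 \le i \le \ell$.

Then

$\psi_i = -\sigma_{n_{i-2}}(f,g)$ for $3 \le i \le \ell$.

Since all subresultants are in $R$, this gives an answer to Question 4.4:

Theorem 6.15. The coefficients $\psi_i$ and $\beta_i$ of the subresultant PRS are always
in $R$.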
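The statements of Corollary 6.14 and Theorem 6.15 can be observed on Example 4.7. The sketch below is ours and assumes the commonly used form of the subresultant division rule, which is defined in Example 4.2 and not repeated in this section: $\alpha_i = \rho_{i-1}^{d_{i-2}+1}$, $\beta_2 = (-1)^{d_0+1}$, $\psi_2 = -1$, $\psi_i = (-\rho_{i-2})^{d_{i-3}} \psi_{i-1}^{1-d_{i-3}}$ and $\beta_i = -\rho_{i-2} \psi_i^{d_{i-2}}$ for $i \ge 3$. The helper names are hypothetical, and remainders are computed over $\mathbb{Q}$ only for convenience.

```python
from fractions import Fraction

def poly_rem(a, b):
    """Remainder of a modulo b over Q; coefficient lists, highest degree first."""
    a = [Fraction(c) for c in a]
    b = [Fraction(c) for c in b]
    while len(a) >= len(b):
        q = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= q * b[i]
        a.pop(0)
    while a and a[0] == 0:
        a.pop(0)
    return a

def subresultant_prs(f, g):
    """Return the subresultant PRS of integer polynomials and the values psi_i."""
    seq = [list(f), list(g)]
    psis = {2: Fraction(-1)}                       # psi_2 = -1
    i = 2
    while True:
        r_old, r_new = seq[-2], seq[-1]            # r_{i-2}, r_{i-1}
        rem = poly_rem(r_old, r_new)
        if not rem:
            return seq, psis
        d = len(r_old) - len(r_new)                # d_{i-2}
        alpha = Fraction(r_new[0]) ** (d + 1)      # rho_{i-1}^(d_{i-2}+1)
        if i == 2:
            beta = Fraction(-1) ** (d + 1)         # beta_2 = (-1)^(d_0+1)
        else:
            # psi_i = (-rho_{i-2})^(d_{i-3}) * psi_{i-1}^(1-d_{i-3})
            psis[i] = (-Fraction(r_old[0])) ** d_prev * psis[i - 1] ** (1 - d_prev)
            beta = -r_old[0] * psis[i] ** d        # beta_i = -rho_{i-2} * psi_i^(d_{i-2})
        r = [alpha * c / beta for c in rem]
        assert all(c.denominator == 1 for c in r)  # coefficients stay integral
        seq.append([int(c) for c in r])
        d_prev, i = d, i + 1

f = [9, 0, -27, -27, 72, 18, -45]
g = [3, 0, -4, -9, 21]
seq, psis = subresultant_prs(f, g)
```

On this input the sequence reproduces the subresultant column of the table in Example 4.7, and the computed $\psi_3$ and $\psi_4$ are the negatives of the subresultants $\sigma_{n_1}(f,g) = 9$ and $\sigma_{n_2}(f,g) = 9801$, as Corollary 6.14 predicts.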

7 Analysis of Coefficient Growth and Running Time 

We first estimate the running times for normal PRS. A proof for an exponential 
upper bound for the pseudo PRS is in Knuth (1981), p. 414, but our goal is to 
show an exponential lower bound. To this end, we prove two such bounds on 
the bit length of the leading coefficients pi in this PRS. Recall that pi = lc((/) 
and cr„i = pI°+^. 

Lemma 7.1. Suppose that $(f,g) \in \mathbb{Z}[x]^2$ have a normal pseudo PRS. Then

\Pi\ > IpiT "" for 3<i<£. 

Lemma 7.2. Suppose that $(f,g) \in \mathbb{Z}[x]^2$ have a normal pseudo PRS, and that
$|\rho_1| = 1$. Then



\Pi\>Aui cFnAf^gf' for3<i<£. 

2<k<i-2 

Theorem 7.3. Computing the pseudo PRS takes exponential time, at least $2^n$,
in some cases with input polynomials of degrees at most $n$.
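The coefficient growth behind this lower bound is already visible on the small input of Example 4.7. The following sketch is ours, not the C++ code of Section 8. It assumes that the pseudo rule uses $\alpha_i = \rho_{i-1}^{d_{i-2}+1}$, $\beta_i = 1$ and that the primitive rule divides the pseudo-remainder by its content, taken here as the nonnegative gcd of the coefficients. The names `prs_over_Z` and `max_bits` are hypothetical.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def poly_rem(a, b):
    """Remainder of a modulo b over Q; coefficient lists, highest degree first."""
    a = [Fraction(c) for c in a]
    b = [Fraction(c) for c in b]
    while len(a) >= len(b):
        q = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= q * b[i]
        a.pop(0)
    while a and a[0] == 0:
        a.pop(0)
    return a

def prs_over_Z(f, g, primitive):
    """Pseudo PRS (primitive=False) or primitive PRS (primitive=True) over Z."""
    seq = [list(f), list(g)]
    while True:
        r_old, r_new = seq[-2], seq[-1]
        rem = poly_rem(r_old, r_new)
        if not rem:
            return seq
        alpha = r_new[0] ** (len(r_old) - len(r_new) + 1)   # lc^(d+1)
        prem = [int(alpha * c) for c in rem]                # integral pseudo-remainder
        beta = reduce(gcd, prem, 0) if primitive else 1     # content or 1
        seq.append([c // beta for c in prem])

def max_bits(poly):
    """Largest bit length among the coefficients of poly."""
    return max(abs(c).bit_length() for c in poly)

f = [9, 0, -27, -27, 72, 18, -45]
g = [3, 0, -4, -9, 21]
pseudo_seq = prs_over_Z(f, g, primitive=False)
primitive_seq = prs_over_Z(f, g, primitive=True)
pseudo_growth = [max_bits(r) for r in pseudo_seq]
primitive_growth = [max_bits(r) for r in primitive_seq]
```

Comparing `pseudo_growth` with `primitive_growth` shows the bit lengths of the pseudo PRS increasing from one remainder to the next, while the primitive PRS stays small on this input.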
We have the following running time bound for the normal reduced PRS
algorithm.

Theorem 7.4. Let ||/||oo; llffiloo < A, B = (n+ and let {tq,... ,ri) 

he the normal reduced PRS for f,g. Then the max-norm of the ri is at most 
4B^, and the algorithm uses 0{n^mlog'^ (nA)) word operations. 



Corollary 7.5. Since Corollary 6.11 shows that normal reduced PRS and normal
subresultant PRS agree up to sign, the estimates in Theorem 7.4 are also
true for normal subresultant PRS.

We conclude the theoretical part of our comparison with an overview of all
worst-case running times for the various normal PRS in Table 2. The lengths of
the coefficients of $f$ and $g$ are assumed to be at most $n$. The estimations that
are not proven here can be found in von zur Gathen & Gerhard (1999).



| PRS | time | proven in |
|---|---|---|
| classical/Sturmian | | von zur Gathen & Gerhard (1999) |
| monic | $n^6$ | von zur Gathen & Gerhard (1999) |
| pseudo | $c^n$ with $c \ge 2$ | Theorem 7.3 |
| primitive | $n^6$ | von zur Gathen & Gerhard (1999) |
| reduced / subresultant | $n^6$ | Theorem 7.4 |

Table 2. Comparison of various normal PRS. The time in bit operations is for polynomials
of degree at most $n$ and with coefficients of length at most $n$ and ignores
logarithmic factors.



8 Experiments 

We have implemented six of the PRS for polynomials with integral coefficients
in C++, using Victor Shoup's "Number Theory Library" NTL 3.5a for integer
and polynomial arithmetic. Since the Sturmian PRS agrees with the classical
PRS up to sign, it is not mentioned here. The contents of the intermediate results
in the primitive PRS are simply computed by successive gcd computations.
Cooperman et al. (1999) propose a new algorithm that uses only an expected
number of two gcd computations, but on random inputs it is slower than the
naive approach. All timings are the average over 10 pseudorandom inputs. The
software ran on a Sun Sparc Ultra 1 clocked at 167MHz.

In the first experiment we pseudorandomly and independently chose three
polynomials $f, g, h \in \mathbb{Z}[x]$ of degree $n - 1$ with nonnegative coefficients of length
less than $n$, for various values of $n$. Then we used the various PRS algorithms
to compute the gcd of $fh$ and $gh$ of degrees less than $2n$. The running times are
shown in Figures 1 and 2.

Fig. 1. Computation of polynomial remainder sequences for polynomials of degree $n - 1$
with coefficients of bit length less than $n$ for $1 \le n \le 32$. (Curves: pseudo, monic,
subresultant, classical, reduced, primitive; horizontal axis: $n$.)

As seen in Table 2 the pseudo PRS turns out to be the slowest algorithm.
The reason is that for random inputs with coefficients of length at most $n$ the
second polynomial is almost never monic. Thus Theorem 7.3 shows that for random
inputs the running time for pseudo PRS is mainly exponential. A surprising
result is that the primitive PRS, even implemented in a straightforward manner,
turns out to be the fastest PRS. Collins and Brown & Traub invented
the subresultant PRS only in order to avoid the primitive PRS, since it seemed too
expensive, but our tests show that for our current software this is not a problem.

Polynomial remainder sequences of random polynomials tend to be normal. 
Since Corollary 6.11 shows that reduced and subresultant PRS agree up to signs 
in the normal case, their running times also differ by little. 

Fig. 2. Computation of polynomial remainder sequences for polynomials of degree $n - 1$
with coefficients of bit length less than $n$ for $32 \le n \le 96$. Time is now measured in
minutes. (Curves: monic, subresultant, reduced, primitive.)

We are also interested in comparing the reduced and subresultant PRS, so
we construct PRS which are not normal. To this end, we pseudorandomly and
independently choose six polynomials $f, f_1, g, g_1, h, h_1$ for various degrees $n$ as
follows:

| polynomial | degree | coefficient length |
|---|---|---|
| $f, g$ | $n/6$ | $n/4$ |
| $f_1, g_1$ | $n/3$ | $n$ |
| $h$ | $n/2$ | |
| $h_1$ | $n$ | $n$ |

So the polynomials 

F={fh-x’^ + h)hi 
G={gh- + gi)hi 



have degrees less than 2n with coefficient length less than n, and every polyno- 
mial remainder sequence of F and G has a degree jump of | at degree 2n — 
Then we used the various PRS algorithms to compute the gcd of $F$ and $G$. The
running times are illustrated in Figures 3 and 4.

As in the first test series the pseudo PRS turns out to be the slowest, and
the primitive PRS is the fastest. Here the monic PRS is faster than the reduced
PRS. Since the PRS is non-normal, the $\alpha_i$'s are powers of the leading coefficients
of the intermediate results, and their computation becomes quite expensive.

Fig. 3. Computation of non-normal polynomial remainder sequences for polynomials
of degree $2n - 1$ with coefficient length less than $n$ and a degree jump, for
$1 \le n \le 32$. (Curves: pseudo, monic, subresultant, classical, reduced, primitive;
horizontal axis: $n$.)
Abstract. We develop a general framework for the analysis of algorithms of a
broad Euclidean type. The average-case complexity of an algorithm is seen to be
related to the analytic behaviour in the complex plane of the set of elementary
transformations determined by the algorithms. The methods rely on properties of
transfer operators suitably adapted from dynamical systems theory. As a consequence,
we obtain precise average-case analyses of four algorithms for evaluating
the Jacobi symbol of computational number theory fame, thereby solving conjectures
of Bach and Shallit. These methods provide a unifying framework for
the analysis of an entire class of gcd-like algorithms together with new results
regarding the probable behaviour of their cost functions.

1 Introduction 

Euclid’s algorithm, discovered as early as 300BC, was analysed first in the worst case 
in 1733 by de Lagny, then in the average-case around 1969 independently by Heilbronn 
[8] and Dixon [5], and finally in distribution by Hensley [9] who proved in 1994 that 
the Euclidean algorithm has Gaussian behaviour; see Knuth’s and Shallit’s vivid ac- 
counts [12,20]. The first methods used range from combinatorial (de Lagny, Heilbronn) 
to probabilistic (Dixon). In parallel, studies by Levy, Khinchin, Kuzmin and Wirsing 
had established the metric theory of continued fractions by means of a specific density 
transformer. The more recent works rely for a good deal on transfer operators, a far- 
reaching generalization of density transformers, originally introduced by Ruelle [17,18] 
in connection with the thermodynamic formalism and dynamical systems theory [1]. 
Examples are Mayer’s studies on the continued fraction transformation [14], Hensley’s 
work [9] and several papers of the author [21,22] including her analysis of the Binary 
GCD Algorithm [23]. 

In this paper, we provide new analyses of several classical and semi-classical variants 
of the Euclidean algorithm. A strong motivation of our study is a group of gcd-like 
algorithms that compute the Jacobi symbol whose relation to quadratic properties of 
numbers is well-known. 

Methods. Our approach consists in viewing an algorithm of the broad gcd type as a
dynamical system, where each iterative step is a linear fractional transformation (LFT) of
the form $z \mapsto (az + b)/(cz + d)$. The system control may be simple, what we call generic
below, but also multimodal, what we call Markovian. A specific set of transformations
is then associated to each algorithm. It will appear from our treatment that the
computational complexity of an algorithm is in fact dictated by the collective dynamics of its
associated set of transformations. More precisely, two factors intervene: (i) the
characteristics of the LFT's in the complex domain; (ii) their contraction properties, notably
near fixed points. There results a classification of gcd-like algorithms in terms of the
average number of iterations: some of them are "fast", that is, of logarithmic complexity
$\Theta(\log N)$, while others are "slow", that is, of the log-squared type $\Theta(\log^2 N)$.

It is established here that strong contraction properties of the elementary transforma- 
tions that build up a gcd-like algorithm entail logarithmic cost, while the presence of 
an indifferent fixed-point leads to log-squared behaviour. In the latter case, the analysis 
requires a special twist that takes its inspiration from the study of intermittency phe- 
nomena in physical systems that was introduced by Bowen [2] and is nicely exposed in 
a paper of Prellberg and Slawny [15]. An additional benefit of our approach is to open 
access to characteristics of the distribution of costs, including information on moments: 
the fast algorithms appear to have concentration of distribution — the cost converges in 
probability to its mean — while the slow ones exhibit an extremely large dispersion of 
costs. 

Technically, this paper relies on a description of relevant parameters by means of
generating functions, a by now common tool in the average-case analysis of algorithms [7].
As is usual in number theory contexts, the generating functions are Dirichlet series. They
are first proved to be algebraically related to specific operators that encapsulate all the
important information relative to the "dynamics" of the algorithm. Their analytical
properties depend on spectral properties of the operators, most notably the existence of a
"spectral gap" that separates the dominant eigenvalue from the remainder of the spectrum.
This determines the singularities of the Dirichlet series of costs. The asymptotic extraction of
coefficients is then achieved by means of Tauberian theorems [4], one of several ways
to derive the prime number theorem. Average complexity estimates finally result. The
main thread of the paper is thus adequately summarized by the chain:

Euclidean algorithm $\to$ Associated transformations $\to$ Transfer operator
$\to$ Dirichlet series of costs $\to$ Tauberian inversion $\to$ Average-case complexity.

This chain then leads to effective and simple criteria for distinguishing slow algorithms
from fast ones, for establishing concentration of distribution, for analysing various cost
parameters of algorithms, etc. The constants relative to the slow algorithms are all
explicit, while the constants relative to the fast algorithms are closely related to the
entropy of the associated dynamical system: they are computable numbers; however,
except in two classical cases, they do not seem to be related to classical constants of
analysis.



Motivations. We study here eight algorithms: the first four algorithms are variations of 
the classical Euclidean algorithm and are called Classical (G), By-Excess (L), Classical- 
Centered (K), and Subtractive (T). The last four algorithms serve to compute the Jacobi 
symbol introduced in Section 2, and are called Even (E), Odd (O), Ordinary (U) and 
Centered (C). 

The complexity of the first four algorithms is now known: The two classical algorithms
(G) and (K) have been analysed by Heilbronn, Dixon and Rieger [16]. The Subtractive
algorithm (T) was studied by Knuth and Yao [25], and Vardi [24] analysed the By-Excess
Algorithm (L) by comparing it to the Subtractive Algorithm. The methods used are rather
disparate, and their applicability to new situations is somewhat unclear. Here, we design
a unifying framework that also provides new results on the distribution of costs.

Two of the Jacobi Symbol Algorithms, the Centered (C) and Even (E) algorithms, have 
been introduced respectively by Lebesgue [13] in 1847 and Eisenstein [6] in 1844. Three 
of them, the Centered, Ordinary and Even algorithms, have been studied by Shallit [19] 
who provided a complete worst-case analysis. The present paper solves completely a 
conjecture of Bach and Shallit. Indeed, in [19], Shallit writes: "Bach has also
suggested that one could investigate the average number of division steps in computing
the Jacobi symbol [...]. This analysis is probably feasible to carry out for the Even
Algorithm, and it seems likely that the average number of division steps is $\Theta(\log^2 N)$.
However, determining the average behaviour for the two other algorithms seems quite
hard."

Results and plan of the paper. Section 3 is the central technical section of the paper. 
There, we develop the line of attack outlined earlier and introduce successively Dirichlet 
generating functions, transfer operators of the Ruelle type, and the basic elements of 
Tauberian theory that are adequate for our purposes. The main results of this section 
are summarized in Theorem 1 and Theorem 2 that imply a general criterion for loga- 
rithmic versus log-squared behaviour, while providing a framework for higher moment 
analyses. 

In Section 4, we return to our eight favorite algorithms: four classical variations and four
Jacobi symbol variations. The corresponding analyses are summarized in Theorems 3
and 4 where we list our main results, some old and some new, that fall as natural
consequences of the present framework. It results from the analysis (Theorem 3) that
the Fast Class contains two classic algorithms, the Classical Algorithm (G), and the
Classical Centered Algorithm (K), together with three Jacobi Symbol algorithms: the
Odd (O), Ordinary (U) and Centered (C) Algorithms. Their respective average-case
complexities on pairs of integers less than $N$ are of the form $H_N \sim A_H \log N$ for
$H \in \{G, K, O, U, C\}$.

The five constants are effectively characterized in terms of entropies of the associated
dynamical system, and the constants related to the two classical algorithms are easily
obtained, $A_G = (12/\pi^2) \log 2$, $A_K = (12/\pi^2) \log \phi$.

Theorem 4 proves that the Slow Class contains the remaining three algorithms, the
By-Excess Algorithm (L), the Subtractive Algorithm (T), and one of the Jacobi Symbol
Algorithms, the Even Algorithm (E). They all have a complexity of the log-squared type,
$L_N \sim (3/\pi^2) \log^2 N$, $T_N \sim (6/\pi^2) \log^2 N$, $E_N \sim (2/\pi^2) \log^2 N$.
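The logarithmic order of the Classical Algorithm (G) can be observed empirically with a small experiment. The sketch below is ours and is only meant as an illustration: the input set (all pairs $1 \le u < v \le N$), the value $N = 200$ and the function names are assumptions, and for such small $N$ the lower-order terms are still clearly visible, so the ratio to the main term $(12/\pi^2) \log 2 \cdot \log N$ is only expected to be of order 1.

```python
import math

def classical_steps(u, v):
    """Number of divisions performed by the classical algorithm on (u, v), u < v."""
    steps = 0
    while u != 0:
        u, v = v % u, u
        steps += 1
    return steps

def average_steps(N):
    """Average number of divisions over all pairs 1 <= u < v <= N."""
    total = count = 0
    for v in range(2, N + 1):
        for u in range(1, v):
            total += classical_steps(u, v)
            count += 1
    return total / count

N = 200
A_G = 12 * math.log(2) / math.pi ** 2      # constant of the Classical Algorithm
ratio = average_steps(N) / (A_G * math.log(N))
```

Repeating the computation for growing $N$ gives a rough picture of how the observed average approaches the asymptotic main term.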

Finally, Theorem 5 provides new probabilistic characterizations of the distribution of
the costs: in particular, the approach applies to the analysis of the subtractive GCD
algorithms for which we derive the order of growth of higher moments, which appears
to be new. We also prove that concentration of distribution holds in the case of the
five fast algorithms (G, K, O, U, C).

Apart from specific analyses, our main contributions are the following:

(a) We show how transfer operator methods may be extended to cope with complex
situations where the associated dynamical system may be either random or Markovian
(or both!).

(b) An original feature in the context of analysis of algorithms is the encapsulation of
the method of inducing (related to intermittency as evoked above).

(c) Our approach opens access to information on higher moments of the distribution of
costs.

2 Eight Variations of the Euclidean Algorithm 

We present here the eight algorithms to be analysed; the first four are classical variants 
of the Euclidean Algorithm, while the last four are designed for computing the Jacobi 
Symbol. 

2.1. Variations of the classical Euclidean Algorithm. There are two divisions between
$u$ and $v$ ($v > u$) that produce a positive remainder $r$ such that $0 \le r < u$: the classical
division (by-default) of the form $v = cu + r$, and the division by-excess, of the form
$v = cu - r$. The centered division between $u$ and $v$ ($v > u$), of the form $v = cu + \epsilon r$
with $\epsilon = \pm 1$, produces a positive remainder $r$ such that $0 \le r \le u/2$. Three
Euclidean algorithms are associated with these three types of division, respectively called
the Classical Algorithm (G), the By-Excess Algorithm (L), and the Classical Centered
Algorithm (K). Finally, the Subtractive Algorithm (T) uses only subtractions and no
divisions, since it replaces the classical division $v = cu + r$ by exactly $c$ subtractions
of the form $v := v - u$.
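The difference between these variants is easy to see on a concrete input. The following Python sketch is ours and counts the steps of three of the four algorithms on a pair $(u, v)$ with $u < v$. The By-Excess Algorithm is left out because its stopping convention is not spelled out in this paragraph, and for the Subtractive Algorithm we assume that every division with quotient $c$ is charged $c$ subtractions, including the last one.

```python
def classical_steps(u, v):
    """Divisions v = c*u + r with 0 <= r < u, followed by the exchange (u, v) -> (r, u)."""
    steps = 0
    while u != 0:
        u, v = v % u, u
        steps += 1
    return steps

def centered_steps(u, v):
    """Centered divisions v = c*u + eps*r with eps = +1 or -1 and 0 <= r <= u/2."""
    steps = 0
    while u != 0:
        r = v % u
        r = min(r, u - r)        # choose the representative closest to 0
        u, v = r, u
        steps += 1
    return steps

def subtractive_steps(u, v):
    """Each classical division with quotient c is replaced by c subtractions."""
    steps = 0
    while u != 0:
        steps += v // u
        u, v = v % u, u
    return steps

# Two consecutive Fibonacci numbers, a slow input for the classical algorithm.
u, v = 13, 21
counts = (classical_steps(u, v), centered_steps(u, v), subtractive_steps(u, v))
```

On this input the centered division needs fewer steps than the classical one, while the subtractive count equals the sum of the classical quotients.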



2.2. Variations for computing Jacobi symbol. The Jacobi symbol, introduced in [11], is
a very important tool in algebra, since it is related to quadratic characteristics of modular
arithmetic. Interest in its efficient computation has been reawakened by its utilisation in
primality tests and in some important cryptographic schemes.

For two integers $u$ and $v$ ($v$ odd), the possible values for the Jacobi symbol $J(u,v)$
are $-1$, $0$, $+1$. Even if the Jacobi symbol can be directly computed from the classical
Euclidean algorithm, thanks to a formula due to Hickerson [10], quoted in [24], we are
mainly interested in specific algorithms that run faster. These algorithms are
fundamentally based on the following two properties.

Quadratic Reciprocity law: $J(u, v) = (-1)^{(u-1)(v-1)/4} J(v, u)$ for $u, v > 0$ odd.
Modulo law: $J(v, u) = J(v - bu, u)$,

and they perform, like the classical Euclidean algorithm, a sequence of Euclidean-like
divisions and exchanges. However, since the Quadratic Reciprocity law holds only for odd
integers, the standard Euclidean division has to be transformed into a pseudo-euclidean
division of the form

$v = bu + \epsilon 2^k s$ with $\epsilon = \pm 1$, $s$ odd and strictly less than $u$,

that creates another pair $(s, u)$ for the following step. Then the symbol $J(u, v)$ is easily
computed from the symbol $J(s, u)$.
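A minimal Python sketch of such a computation is given below. It is the textbook procedure, not a transcription of any one of the four algorithms analysed here, although it is closest in spirit to the Ordinary Algorithm: it takes ordinary remainders (Modulo law), removes powers of two from them, and exchanges the arguments (Quadratic Reciprocity law). Removing a factor 2 uses the standard supplementary law $J(2, v) = -1$ exactly when $v \equiv 3, 5 \pmod 8$, which is an assumption on our side since it is not stated in the text.

```python
def jacobi(u, v):
    """Jacobi symbol J(u, v) for an odd positive integer v."""
    assert v > 0 and v % 2 == 1
    u %= v                                   # Modulo law
    result = 1
    while u != 0:
        while u % 2 == 0:                    # remove powers of two
            u //= 2
            if v % 8 in (3, 5):              # supplementary law for J(2, v)
                result = -result
        u, v = v, u                          # exchange, both arguments are odd
        if u % 4 == 3 and v % 4 == 3:        # Quadratic Reciprocity law
            result = -result
        u %= v                               # Modulo law again
    return result if v == 1 else 0

def euler_symbol(a, p):
    """Legendre symbol of a modulo an odd prime p via Euler's criterion."""
    e = pow(a, (p - 1) // 2, p)
    return -1 if e == p - 1 else e
```

For a prime modulus the result can be cross-checked against Euler's criterion, as `euler_symbol` does. The number of passes through the outer loop is the number of pseudo-euclidean divisions, which is the cost studied in the sequel.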

The binary division, used in the Binary GCD algorithm, can also be used for computing 
the Jacobi symbol. However, it is different since the pseudo-division that it uses is NOT 
a modification of the classical euclidean division. We consider here four main algorithms 
according to the kind of pseudo-euclidean division that is performed. They are called 
the Even, Odd, Ordinary and Centered Algorithms, and their inputs are odd integers. The 
Even algorithm (E) performs divisions with even pseudo-quotients, and thus odd
pseudo-remainders. The Odd algorithm (O) performs divisions with odd pseudo-quotients, and
thus even pseudo-remainders from which powers of 2 are removed. The Ordinary (U) and
Centered (C) Algorithms perform divisions where the pseudo-quotients are equal to the
ordinary quotients or to the centered quotients; then, remainders may be even or odd, and,
when they are even, powers of two are removed for obtaining the pseudo-remainders.



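| Alg., Type | Division | LFT's | Conditions |
|---|---|---|---|
| (G) $(1, \text{all}, 0)$ | $v = cu + r$, $0 \le r < u$ | $\frac{1}{c + x}$, $c \ge 1$ | $\mathcal{F}$: $c \ge 2$ |
| (L) $(1, \text{all}, 1)$ | $v = cu - r$, $0 \le r < u$ | $\frac{1}{c - x}$, $c \ge 2$ | $\mathcal{F}$: $c \ge 3$ |
| (K) $(\frac{1}{2}, \text{all}, 0)$ | $v = cu + \epsilon r$, $c \ge 2$, $\epsilon = \pm 1$, $0 \le r \le \frac{u}{2}$ | $\frac{1}{c + \epsilon x}$, $c \ge 2$, $\epsilon = \pm 1$, $(c, \epsilon) \ne (2, -1)$ | $\mathcal{F}$: $\epsilon = 1$ |
| (T) $(1, \text{all}, 0)$ | $v = u + (v - u)$ | $\frac{1}{1 + x}$, $\frac{x}{1 + x}$ | Finishes with $pq$ |
| (E) $(1, \text{odd}, 1)$ | $v = cu + \epsilon s$, $c$ even, $\epsilon = \pm 1$, $s$ odd, $0 < s < u$ | $\frac{1}{c + \epsilon x}$, $c$ even, $\epsilon = \pm 1$ | $\mathcal{F}$: $\epsilon = 1$ |
| (O) $(1, \text{odd}, 0)$ | $v = cu + \epsilon 2^k s$, $c$ odd, $\epsilon = \pm 1$, $s$ odd, $k \ge 1$, $0 \le 2^k s < u$ | $\frac{2^k}{c + \epsilon x}$, $k \ge 1$, $c$ odd, $c \ge 2^k + 1$ | |
| (U) $(1, \text{odd}, 0)$ | $v = cu + 2^k s$, $s = 0$ or $s$ odd, $k \ge 0$, $0 \le 2^k s < u$ | $\mathcal{U}_0 = \{\frac{1}{c + x},\ c \ge 1\}$, $\mathcal{U}_1 = \{\frac{2^k}{c + x},\ k \ge 1,\ c \ge 2^k\}$, $\mathcal{U}_{i \mid j} = \mathcal{U}_j \cap \{c \equiv i \bmod 2\}$ | initial state: 0, final state: 1 |
| (C) $(\frac{1}{2}, \text{odd}, 0)$ | $v = cu + \epsilon 2^k s$, $s = 0$ or $s$ odd, $k \ge 0$, $0 \le 2^k s \le \frac{u}{2}$ | $\mathcal{C}_0 = \{\frac{1}{c + \epsilon x},\ \epsilon = \pm 1,\ c \ge 2,\ (c, \epsilon) \ne (2, -1)\}$, $\mathcal{C}_1 = \{\frac{2^k}{c + \epsilon x},\ k \ge 1,\ \epsilon = \pm 1,\ c \ge 2^{k+1},\ (c, \epsilon) \ne (2^{k+1}, -1)\}$, $\mathcal{C}_{i \mid j} = \mathcal{C}_j \cap \{c \equiv i \bmod 2\}$ | initial state: 0, final state: 1 |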


2.3. The sets of linear fractional transformations. When performing $\ell$ (pseudo)-euclidean
divisions on the input $(u,v)$, each of the eight algorithms builds a specific
continued fraction of height $\ell$ that decomposes the rational $x = u/v$ as

$u/v = h_1 \circ h_2 \circ \ldots \circ h_\ell(a)$,

where the $h_i$'s are linear fractional transformations (LFT's) and $a$ is the last value of the
rational. The value $a$ equals 1 for the Even Algorithm (E) and By-Excess Algorithm (L),
and equals 0 for the other six algorithms. The rational inputs of all the algorithms always
belong to a basic interval of the form $I = [0, \rho]$ with $\rho = 1/2$ for the two centered
algorithms (K) and (C) and $\rho = 1$ in the other six cases. For the first four algorithms, the
valid inputs are all the rationals of $I$, while the valid inputs of the last four algorithms
are only the odd rationals of $I$. The variable valid has two possible values {all, odd},
and finally, the type of the algorithm is defined as the value of the triple $(\rho, \text{valid}, a)$.
The precise form of the possible LFT's depends on the algorithm, and there are two
classes of algorithms, the generic class and the Markovian class:

In the case of the first six algorithms, there may exist special sets of LFT's in the initial
step ($\mathcal{I}$) or in the final step ($\mathcal{F}$). However, all the other steps are generic, in the sense
that they use the same set of LFT's, that we call the generic set. These algorithms are
themselves called generic.

On the contrary, the last two algorithms, the Ordinary Algorithm (U) and the Centered
Algorithm (C), have a Markovian flavour. If the quotient $b$ is odd, then the remainder
is even, and thus $k$ satisfies $k \ge 1$; if $b$ is even, then the remainder is odd, and thus $k$
satisfies $k = 0$. This link is of Markovian type, and we consider two states: the 0 state,
which means "the remainder of $(u, v)$ is odd", i.e. $k = 0$, and the 1 state, which means
"the remainder of $(u, v)$ is even", i.e. $k \ge 1$. Denoting by $\mathcal{U}_j$, resp. $\mathcal{C}_j$, the set of LFT's
which can be used in state $j$, we obtain four different sets $\mathcal{U}_{i|j}$, resp. $\mathcal{C}_{i|j}$, each of which
brings rationals from state $j$ to state $i$. The initial state is the 0 state and the final state is
the 1 state.

3 Dynamical Operators and Tauberian Theorems 

Here, we describe the general tools for analysing algorithms of the Euclidean type 
that are based on some division-like operation and exchanges. We first introduce the 
generating functions relative to the height of the continued fraction and we relate them 
to the dynamical operator associated to the algorithm. This operator can be generic or 
Markovian, according to the structure of the algorithm. In this way, the two Dirichlet 
series that intervene in the analysis, called $F(s)$ and $G(s)$, are expressed in terms of
the Ruelle operator. The average number of steps involves partial sums of coefficients 
of these two Dirichlet series, and Tauberian Theorems are a classical tool that transfers 
analytical properties of Dirichlet series into asymptotic behaviour of their coefficients. 

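3.1. Generating functions. We consider the following sets relative to $I := [0, p]$,

$$\tilde\Omega := \{(u,v);\ u, v \text{ valid},\ (u/v) \in I\}, \qquad \tilde\Omega_N := \{(u,v) \in \tilde\Omega,\ v \le N\},$$

$$\Omega := \{(u,v);\ u, v \text{ valid},\ \gcd(u,v) = 1,\ (u/v) \in I\}, \qquad \Omega_N := \{(u,v) \in \Omega,\ v \le N\},$$

for the possible inputs of an algorithm, and we denote by $\tilde\Omega^{[\ell]}$, $\Omega^{[\ell]}$, $\tilde\Omega_N^{[\ell]}$, $\Omega_N^{[\ell]}$ the
subsets of $\tilde\Omega$, $\Omega$, $\tilde\Omega_N$, $\Omega_N$ for which the algorithm performs exactly $\ell$ pseudo-divisions.
Equivalently, the height of the continued fraction is equal to $\ell$. We study the average
numbers of steps $S_N$, $\tilde S_N$ of the algorithm on $\Omega_N$, $\tilde\Omega_N$,

$$S_N := \frac{1}{|\Omega_N|} \sum_{\ell \ge 0} \ell\, |\Omega_N^{[\ell]}|, \qquad \tilde S_N := \frac{1}{|\tilde\Omega_N|} \sum_{\ell \ge 0} \ell\, |\tilde\Omega_N^{[\ell]}|, \qquad (1)$$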

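and we wish to evaluate their asymptotic behaviour (for $N \to \infty$). We first consider
pairs $(u,v)$ with fixed $v = n$, and we denote by $\nu_n^{[\ell]}$ (resp. $\tilde\nu_n^{[\ell]}$) the number of such
elements of $\Omega^{[\ell]}$ (resp. $\tilde\Omega^{[\ell]}$). We introduce the double generating functions $S(s, w)$ and
$\tilde S(s, w)$ of the sequences $(\nu_n^{[\ell]})$ and $(\tilde\nu_n^{[\ell]})$,

$$S(s,w) := \sum_{\ell \ge 1} w^\ell \sum_{n \ge 1} \frac{\nu_n^{[\ell]}}{n^s}, \qquad \tilde S(s,w) := \sum_{\ell \ge 1} w^\ell \sum_{n \ge 1} \frac{\tilde\nu_n^{[\ell]}}{n^s}. \qquad (2)$$

The Riemann series $\zeta$ relative to valid numbers, $\zeta(s) := \sum_{n \text{ valid}} n^{-s}$, relates the two
generating functions via the equality $\tilde S(s,w) = \zeta(s) S(s,w)$. It is then sufficient to study
$S(s, w)$. We introduce the two sequences $(a_n)$ and $(b_n)$ together with their associated
Dirichlet series $F(s)$, $G(s)$,

$$a_n := \sum_{\ell \ge 1} \nu_n^{[\ell]}, \qquad b_n := \sum_{\ell \ge 1} \ell\, \nu_n^{[\ell]}, \qquad F(s) = \sum_{n \ge 1} \frac{a_n}{n^s}, \qquad G(s) = \sum_{n \ge 1} \frac{b_n}{n^s}. \qquad (3)$$

Now, $F(s)$ and $G(s)$ can be easily expressed in terms of $S(s, w)$ since

$$F(s) = S(s, 1), \qquad G(s) = \frac{\partial}{\partial w} S(s,w)\Big|_{w=1}, \qquad (4)$$

and intervene, via partial sums of their coefficients, in the quantity $S_N$,

$$S_N = \frac{\sum_{n \le N} \sum_{\ell \ge 0} \ell\, \nu_n^{[\ell]}}{\sum_{n \le N} \sum_{\ell \ge 0} \nu_n^{[\ell]}} = \frac{\sum_{n \le N} b_n}{\sum_{n \le N} a_n}. \qquad (5)$$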
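The average defined in (1) can be computed by brute force for small $N$. The sketch below does this for the classical Euclidean algorithm only, under stated assumptions: every positive integer is treated as valid, the interval is $I = [0,1]$, pairs with $u = 0$ are left out, and "less than $N$" is read as $v \le N$. The function names are illustrative and do not come from the paper.

```python
from math import gcd, log

def euclid_steps(u: int, v: int) -> int:
    """Number of divisions performed by the classical Euclidean algorithm on (u, v), 0 < u <= v."""
    steps = 0
    while u != 0:
        u, v = v % u, u   # one division step: (u, v) -> (v mod u, u)
        steps += 1
    return steps

def average_steps(N: int, coprime_only: bool) -> float:
    """Average number of steps over pairs (u, v) with 1 <= u <= v <= N.

    coprime_only=True restricts to gcd(u, v) = 1 (the set Omega_N);
    coprime_only=False uses all pairs (the set written with a tilde).
    """
    total, count = 0, 0
    for v in range(1, N + 1):
        for u in range(1, v + 1):
            if coprime_only and gcd(u, v) != 1:
                continue
            total += euclid_steps(u, v)
            count += 1
    return total / count

S_N = average_steps(200, coprime_only=True)
S_N_tilde = average_steps(200, coprime_only=False)
print(S_N, S_N_tilde, S_N / log(200))
```

Running it for increasing $N$ gives a rough numerical feel for the logarithmic growth that the analysis establishes for the fast algorithms.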



3.2. Ruelle operators. We show now how the Ruelle operators associated to the algorithms
intervene in the evaluation of the generating function $S(s, w)$. We denote by $\mathcal{L}$
a set of LFT's. For each $h \in \mathcal{L}$, $D[h]$ denotes the denominator of the linear fractional
transformation (LFT) $h$, defined for $h(x) = (ax + b)/(cx + d)$ with $a, b, c, d$ coprime
integers by $D[h](x) := |cx + d| = |\det h|^{1/2}\, |h'(x)|^{-1/2}$. The Ruelle operator $\mathbf{L}_s$
relative to the set $\mathcal{L}$ depends on a complex parameter $s$,

$$\mathbf{L}_s[f](x) := \sum_{h \in \mathcal{L}} \frac{1}{D[h](x)^s}\, f \circ h(x). \qquad (6)$$

More generally, when given two sets of LFT's, $\mathcal{L}$ and $\mathcal{K}$, the set $\mathcal{L}\mathcal{K}$ is formed of all
$h \circ g$ with $h \in \mathcal{L}$ and $g \in \mathcal{K}$, and the multiplicative property of the denominator $D$, i.e.,
$D[h \circ g](x) = D[h](g(x))\, D[g](x)$, implies that the operator $\mathbf{K}_s \circ \mathbf{L}_s$ uses all the LFT's
of $\mathcal{L}\mathcal{K}$,

$$\mathbf{K}_s \circ \mathbf{L}_s[f](x) := \sum_{h \in \mathcal{L}\mathcal{K}} \frac{1}{D[h](x)^s}\, f \circ h(x). \qquad (7)$$
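The multiplicative property of the denominator can be checked on a concrete pair of LFT's. The following sketch represents an LFT $x \mapsto (ax+b)/(cx+d)$ by the tuple `(a, b, c, d)`, an encoding chosen here for illustration, and uses two LFT's of the form $1/(c+x)$ so that the composed coefficients stay coprime.

```python
from fractions import Fraction

def compose(h, g):
    """Coefficients of h∘g, obtained as the product of the 2x2 matrices of h and g."""
    a, b, c, d = h
    e, f, p, q = g
    return (a * e + b * p, a * f + b * q, c * e + d * p, c * f + d * q)

def apply(h, x):
    """Evaluate the LFT h at the point x."""
    a, b, c, d = h
    return (a * x + b) / (c * x + d)

def D(h, x):
    """Denominator D[h](x) = |c x + d|."""
    _, _, c, d = h
    return abs(c * x + d)

h = (0, 1, 1, 2)      # x -> 1/(2 + x)
g = (0, 1, 1, 3)      # x -> 1/(3 + x)
x = Fraction(1, 4)    # exact rational arithmetic avoids rounding issues

lhs = D(compose(h, g), x)          # D[h∘g](x)
rhs = D(h, apply(g, x)) * D(g, x)  # D[h](g(x)) * D[g](x)
print(lhs, rhs)
```

Both sides evaluate to the same rational number, which is what makes the composition of two Ruelle operators a sum over composed LFT's.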



3.3. Ruelle operators and generating functions. The first six algorithms are generic,
since they use the same set $\mathcal{H}$ at each generic step. In this case, the $\ell$-th iterate of
$\mathbf{H}_s$ generates all the LFT's used in $\ell$ (generic) steps of the algorithm. The last two
algorithms are Markovian. There are four sets of LFT's, and each of these sets, denoted
by $\mathcal{U}_{i|j}$, "brings" rationals from state $j$ to state $i$. We denote by $\mathbf{U}_{s,i|j}$ the Ruelle operator
associated to the set $\mathcal{U}_{i|j}$, and by $\mathbf{U}_s$ the "matrix operator"

$$\mathbf{U}_s = \begin{pmatrix} \mathbf{U}_{s,0|0} & \mathbf{U}_{s,0|1} \\ \mathbf{U}_{s,1|0} & \mathbf{U}_{s,1|1} \end{pmatrix}. \qquad (8)$$

By the multiplicative property (7), the $\ell$-th iterate of $\mathbf{U}_s$ generates all the elements of
$\mathcal{U}^\ell$, i.e., all the possible LFT's of height $\ell$. More precisely, the coefficient of index
$(i, j)$ of the matrix $\mathbf{U}_s^\ell$ is the Ruelle operator relative to the set of LFT's that brings rationals
from state $j$ to state $i$ in $\ell$ steps.

In both cases, the Ruelle operator is then a "generating" operator, and the generating
functions themselves can be easily expressed with the Ruelle operator:

Proposition 1. The double generating function $S(s, w)$ of the sequence $(\nu_n^{[\ell]})$ can be
expressed as a function of the Ruelle operators associated to the algorithm. In the generic
case,

$$S(s, w) = w\, \mathbf{K}_s[1](a) + w^2\, \mathbf{F}_s \circ (I - w \mathbf{H}_s)^{-1} \circ \mathbf{J}_s[1](a).$$

Here, the Ruelle operators $\mathbf{H}_s$, $\mathbf{F}_s$, $\mathbf{J}_s$, $\mathbf{K}_s$ are relative to the generic set $\mathcal{H}$, the final set $\mathcal{F}$,
the initial set $\mathcal{J}$ or the mixed set $\mathcal{K} := \mathcal{J} \cap \mathcal{F}$; the value $a$ is the final value of the rational $u/v$.

In the Markovian case, the Ruelle operator $\mathbf{U}_s$ is a matrix operator, and

$$S(s, w) = \begin{pmatrix} 0 & 1 \end{pmatrix}\, w \mathbf{U}_s\, (I - w \mathbf{U}_s)^{-1} \begin{pmatrix} 1 \\ 0 \end{pmatrix} (0).$$

In both cases, the Dirichlet series $F(s)$ and $G(s)$ involve powers of the quasi-inverse of the
Ruelle operator, of order 1 for $F(s)$ and of order 2 for $G(s)$.

3.4. Tauberian Theorems. Finally, we have shown that the average number of steps $S_N$
of the four Algorithms on $\Omega_N$ is a ratio where the numerators and the denominators
involve the partial sums of the Dirichlet series $F(s)$ and $G(s)$. Thus, the asymptotic
evaluation of $S_N$, $\tilde S_N$ (for $N \to \infty$) is possible if we can apply the following Tauberian
theorem [4] to the Dirichlet series $F(s)$, $\zeta(s)F(s)$, $G(s)$, $\zeta(s)G(s)$.

Tauberian Theorem. [Delange] Let $F(s)$ be a Dirichlet series with non negative coefficients
such that $F(s)$ converges for $\Re(s) > \sigma > 0$. Assume that (i) $F(s)$ is analytic on
$\Re(s) = \sigma$, $s \ne \sigma$, and (ii) for some $\beta \ge 0$, one has $F(s) = A(s)(s - \sigma)^{-\beta-1} + C(s)$,
where $A$, $C$ are analytic at $\sigma$, with $A(\sigma) \ne 0$. Then, as $N \to \infty$,

$$\sum_{n \le N} a_n = \frac{A(\sigma)}{\sigma\, \Gamma(\beta + 1)}\, N^\sigma \log^\beta N\, [1 + \varepsilon(N)], \qquad \varepsilon(N) \to 0.$$
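A quick numerical sanity check of this kind of statement uses a textbook example that is not part of the paper: the Dirichlet series $\zeta(s)^2 = \sum_n d(n) n^{-s}$, where $d(n)$ is the number of divisors of $n$. Assuming the usual form of Delange's theorem with $\sigma = 1$, $\beta = 1$ and $A(1) = 1$ (double pole at $s = 1$), the predicted main term of $\sum_{n \le N} d(n)$ is $N \log N$. The sketch below compares the two.

```python
from math import log

def divisor_partial_sum(N: int) -> int:
    """Sum of d(n) for n <= N, using the identity sum_{n<=N} d(n) = sum_{k<=N} floor(N/k)."""
    return sum(N // k for k in range(1, N + 1))

N = 100_000
ratio = divisor_partial_sum(N) / (N * log(N))
print(ratio)  # slowly approaches 1 as N grows
```

The ratio tends to 1 only at logarithmic speed, because the theorem controls the main term and says nothing about the lower-order term of order $N$.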

In the remainder of the paper, we show that the Tauberian Theorem applies to $F(s)$, $G(s)$
with $\sigma = 2$. For $F(s)$, it applies with $\beta = 0$. For $G(s)$, it applies with $\beta = 1$ or $\beta = 2$.
For the slow algorithms, $\beta$ equals 2, and the average number of steps will be of order
$\log^2 N$. For the fast algorithms, $\beta$ equals 1, and this will prove the logarithmic behaviour
of the average number of steps. First, the function $F(s)$ is closely linked to the $\zeta$ function
relative to valid numbers. Then, the Tauberian Theorem applies to $F(s)$ and $\zeta(s)F(s)$,
with $\sigma = 2$ and $\beta = 0$.

3.5. Functional Analysis. Here, we consider the following conditions on a set $\mathcal{H}$ of LFT's
that will entail that the Ruelle operator $\mathbf{H}_s$ relative to $\mathcal{H}$ fulfills all the properties that
we need for applying the Tauberian Theorem to the Dirichlet series $G(s)$.



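Conditions $Q(\mathcal{H})$. There exist an open disk $\mathcal{V} \supset I$ and a real $\alpha < 2$ such that

(C1) Every LFT $h \in \mathcal{H}$ has an analytic continuation on $\mathcal{V}$, and $h$ maps the closure $\bar{\mathcal{V}}$ of
the disk $\mathcal{V}$ inside $\mathcal{V}$. Every function $|h'|$ has an analytic continuation on $\mathcal{V}$, denoted by $\tilde h$.

(C2) For each $h \in \mathcal{H}$, there exists $\delta(h) < 1$ for which $0 < |\tilde h(z)| \le \delta(h)$ for all $z \in \mathcal{V}$,
and such that the series $\sum_{h \in \mathcal{H}} \left| \dfrac{\delta(h)}{\det(h)} \right|^{s/2}$ converges on the plane $\Re(s) > \alpha$.

(C3) Let $\mathcal{H}_{[k]}$ denote the subset of $\mathcal{H}$ defined as $\mathcal{H}_{[k]} := \{h \in \mathcal{H} \mid \det(h) = 2^k\}$. One
of the two conditions (i) or (ii) holds:
(i) $\mathcal{H} = \mathcal{H}_{[0]}$,
(ii) For any $k \ge 1$, $\mathcal{H}_{[k]}$ is not empty and $\mathcal{H} = \bigcup_{k \ge 1} \mathcal{H}_{[k]}$.
Moreover, for any $k \ge 0$ for which $\mathcal{H}_{[k]} \ne \emptyset$, the intervals $(h(I),\ h \in \mathcal{H}_{[k]})$ form a
pseudo-partition of $I$.

(C4) For some integer $A$, the set $\mathcal{H}$ contains a subset $\mathcal{D}$ of the form
$\mathcal{D} := \{h \mid h(x) = A/(c + x) \text{ with integers } c \to \infty\}$.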



We denote by $\mathbf{H}_s$ a Ruelle operator associated to a generic set $\mathcal{H}$, and by $\mathbf{U}_s$ a Ruelle
Markovian operator (associated to a Markovian process with two states). In this case,
the subset $\mathcal{U}_i$ denotes the set relative to state $i$. Here $I$ denotes the basic interval $[0, p]$.
We consider that conditions $Q(\mathcal{H})$ and $Q(\mathcal{U}_i)$ (for $i = 0, 1$) hold. Then, we can prove
the following: Under conditions (C1) and (C2), the Ruelle operator $\mathbf{H}_s$ acts on $A_\infty(\mathcal{V})$
for $\Re(s) > \alpha$ and the operator $\mathbf{U}_s$ acts on $A_\infty(\mathcal{V})^2$ for $\Re(s) > \alpha$. They are compact
(even nuclear in the sense of Grothendieck). Furthermore, for real values of the parameter $s$,
they have dominant spectral properties: there exist a unique dominant eigenvalue $\lambda(s)$,
positive and analytic for $s > \alpha$, a unique dominant eigenfunction $\psi_s$, and a unique dominant
projector $e_s$ such that $e_s[\psi_s] = 1$. Then, there is a spectral gap between the dominant
eigenvalue and the remainder of the spectrum. Under conditions (C3), the operators $\mathbf{H}_2$,
$\mathbf{U}_2$ are density transformers; thus one has $\lambda(2) = 1$ and $e_2[f] = \int_I f(t)\,dt$ (generic case)
or $e_2[f_0, f_1] = \int_I [f_0(t) + f_1(t)]\,dt$ (Markovian case). Finally, condition (C4) implies
that the operators $\mathbf{H}_s$, $\mathbf{U}_s$ have no eigenvalue equal to 1 on the line $\Re(s) = 2$, $s \ne 2$.
As a result, the powers of the quasi-inverse of the Ruelle operator which intervene in the
expression of the generating functions $F(s)$ and $G(s)$ fulfill all the hypotheses of the Tauberian
Theorem:

Theorem 1. Let $\mathcal{H}$ be a generic set that satisfies $Q(\mathcal{H})$ and $\mathcal{U}$ be a Markovian set
that satisfies $Q(\mathcal{U}_i)$ ($i = 0, 1$). Then, for any $p \ge 1$, the $p$-th powers $(I - \mathbf{H}_s)^{-p}$,
$(I - \mathbf{U}_s)^{-p}$ of the quasi-inverse of the Ruelle operators relative to $\mathcal{H}$ and $\mathcal{U}$ are analytic
on the punctured plane $\{\Re(s) \ge 2,\ s \ne 2\}$ and have a pole of order $p$ at $s = 2$. Near

s = 2, one has, for any function f positive on J , and any x G J , 

{I-Hg)-P[f]{x) or (I-Vs)-P[f]{x)^ 

Here, $\lambda(s)$ is the dominant eigenvalue, $\psi_s$ is the dominant eigenfunction and $e_s$ the
dominant projector with the condition $e_s[\psi_s] = 1$.
Then the Euclidean Algorithm associated to $\mathcal{H}$ or to $\mathcal{U}$ performs an average number
of steps on valid rationals of $I$ with denominator less than $N$ that is asymptotically
logarithmic, $H_N \sim A_H \log N$, $U_N \sim A_U \log N$. The constants $A_H$, $A_U$
involve the entropy of the associated dynamical systems.

In the case when the set $\mathcal{H}$ is only almost well-behaved (it contains one "bad" LFT $p$, but
the set $\mathcal{Q} := \mathcal{H} \setminus \{p\}$ is well-behaved), we adapt the method of inducing that originates
from dynamical systems theory.

Theorem 2. Let $\mathcal{H}$ be a generic set of LFT's for which the following holds: (i) There
exists an element $p$ of $\mathcal{H}$ which possesses an indifferent point, i.e., a fixed point where the
absolute value of the derivative equals 1, (ii) The LFT $p$ does not belong to the final set
T , (Hi) If Q denotes the setTL \ {p}, andAi, the setsp*Q, p*T , then conditions 
Q{M), are fullhlled. 

Then the Euclidean Algorithm associated to $\mathcal{H}$ performs an average number of steps on
valid rationals of $I$ with denominator less than $N$ that is asymptotically of log-squared
type, $H_N \sim \tilde H_N \sim A_H \log^2 N$.

The average number $Q_N$ of good steps (i.e., steps that use elements of $\mathcal{Q}$) performed by
the Euclidean Algorithm on valid rationals of $I$ with denominator less than $N$ satisfies
$Q_N \sim \tilde Q_N \sim A_Q \log N$, and the constant $A_Q$ involves the entropy of the dynamical
system relative to the set $\mathcal{M}$.

4 Average-Case Analysis of the Algorithms 

We now come back to the analysis of the eight algorithms, and we study successively
the fast algorithms and the slow algorithms.

4.1. The Fast Algorithms. We consider the generic sets $\mathcal{G}$, $\mathcal{K}$, $\mathcal{O}$ relative to the Classical
Algorithm, the Classical Centered Algorithm and the Odd Algorithm, or the Markovian
sets $\mathcal{U}$ or $\mathcal{C}$ relative to the Ordinary or the Centered Algorithm. It is easy to verify that
the conditions $Q(\mathcal{G})$, $Q(\mathcal{K})$, $Q(\mathcal{O})$, $Q(\mathcal{U}_i)$ ($i = 0, 1$), $Q(\mathcal{C}_i)$ ($i = 0, 1$) hold.

Moreover, at $s = 2$, the Ruelle operators can be viewed as density transformers. However,
the dynamical systems to which they are associated may be complex objects, since they
are random for the Odd Algorithm, and are both random and Markovian for the Ordinary
and Centered Algorithms. The reason is that the three pseudo-divisions (odd, ordinary,
centered) are related to the dyadic valuation, so that continued fraction expansions are only
defined for rational numbers. However, one can define random continued fractions for
real numbers by choosing in a suitable random way the dyadic valuation of a real
number. Then, the Ruelle operator relative to each algorithm can be viewed as the transfer
operator relative to this random dynamical system. Now, the application of Theorem 1
gives our first main result:

Theorem 3. Consider the five algorithms, the Classical Algorithm (G), the Classical
Centered Algorithm (K), the Odd Algorithm (O), the Ordinary Algorithm (U) or the
Centered Algorithm (C). The average numbers of division steps performed by each
of these five algorithms, on the set of valid inputs of denominator less than $N$, are of
asymptotic logarithmic order. They all satisfy

$$H_N \sim \tilde H_N \sim (2/h(\mathcal{H})) \log N \quad \text{for } H \in \{G, K, O, U, C\},$$

where $h(\mathcal{H})$ is the entropy of the dynamical system relative to the algorithm. For the
first two algorithms, the entropies are explicit,

$$h(\mathcal{G}) = \pi^2/(6 \log 2), \qquad h(\mathcal{K}) = \pi^2/(6 \log \phi).$$

Each of the previous entropies can be computed by adapting methods developed in
previous papers [3,22]. What we have at the moment is values from simulations that
already provide a consistent picture of the relative merits of the Centered, Odd, and
Ordinary Algorithms, namely,

$$A_C \approx 0.430 \pm 0.005, \qquad A_O \approx 0.435 \pm 0.005, \qquad A_U \approx 0.535 \pm 0.005.$$

It is to be noted that the computer algebra system Maple makes use of the Ordinary
Algorithm (perhaps on the basis that only unsigned integers need to be manipulated),
although this algorithm appears from our analysis to be the fast one that has the worst
convergence rate.
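For the two algorithms with explicit entropies, the constant $2/h$ in front of $\log N$ can be evaluated directly. The short sketch below does so, assuming that the symbol in the entropy of the Classical Centered Algorithm is the golden ratio $(1+\sqrt{5})/2$ and that $\log$ is the natural logarithm; the variable names are illustrative.

```python
from math import log, pi, sqrt

phi = (1 + sqrt(5)) / 2                 # golden ratio (assumed meaning of the garbled symbol)

h_G = pi ** 2 / (6 * log(2))            # entropy for the Classical Algorithm (G)
h_K = pi ** 2 / (6 * log(phi))          # entropy for the Classical Centered Algorithm (K)

A_G = 2 / h_G                           # equals 12 log 2 / pi^2
A_K = 2 / h_K                           # equals 12 log(phi) / pi^2

print(round(A_G, 4), round(A_K, 4))
```

These two values sit on either side of the simulated constants quoted above for the Centered, Odd and Ordinary Algorithms, which gives a sense of scale for the comparison.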

4.2. The Slow Algorithms. For the Even Algorithm and the By-Excess Algorithm, the
"bad" LFT is defined by $p(x) := 1/(2 - x)$, with an indifferent point in 1. For the
Subtractive Algorithm, the "bad" LFT is defined by $p(x) := x/(1 + x)$, with an indifferent
point in 0. In the latter case, the induced set $\mathcal{M}$ coincides with the set $\mathcal{G}$ relative to the
Classical Algorithm (G).

When applying Theorem 2 to the sets $\mathcal{E}$, $\mathcal{L}$, $\mathcal{T}$, we obtain our second main result:

Theorem 4. Consider the three algorithms, the By-Excess Algorithm (L), the Subtractive
Algorithm (T), the Even Algorithm (E). The average numbers of steps performed by
each of the three algorithms, on the set of valid inputs of denominator less than $N$, are
of asymptotic log-squared order. They satisfy

$$L_N \sim \tilde L_N \sim A_L \log^2 N, \qquad T_N \sim \tilde T_N \sim A_T \log^2 N, \qquad E_N \sim \tilde E_N \sim A_E \log^2 N,$$

with $A_L = 3/\pi^2$, $A_T = 6/\pi^2$, $A_E = 2/\pi^2$.

The average numbers of good steps performed by the algorithms on the set of valid
inputs of denominator less than $N$ satisfy

$$P_N \sim \tilde P_N \sim A_P \log N, \qquad G_N \sim \tilde G_N \sim A_G \log N, \qquad M_N \sim \tilde M_N \sim A_M \log N,$$

with $A_P = (6 \log 2)/\pi^2$, $A_G = (12 \log 2)/\pi^2$, $A_M = (4 \log 3)/\pi^2$.

4.3. Higher moments. Our methods apply to other parameters of continued fraction. 
On the other hand, by using successive derivatives of the double generating function 
S{s,w), we can easily evaluate higher moments of the random variable “number of 
iterations". 

Theorem 5. (i) For any integer $\ell \ge 1$ and any of the five fast algorithms, the $\ell$-th moment
of the cost function is asymptotic to the $\ell$-th power of the mean. In particular the standard
deviation is $o(\log N)$. Consequently the random variable expressing the cost satisfies
the concentration of distribution property.

(ii) For any integer £ > 2 and any of the three slow algorithms, the £th moment of the 
cost function is of order and the standard deviation is $O(\sqrt{N})$.

Abstract. In this paper, we consider the open problem of the complexity of the
LLL algorithm in the case when the approximation parameter of the algorithm 
has its extreme value 1. This case is of interest because the output is then the 
strongest Lovász-reduced basis. Experiments reported by Lagarias and Odlyzko
[13] seem to show that the algorithm remains polynomial on average. However no
bound better than a naive exponential order one is established for the worst-case 
complexity of the optimal LLL algorithm, even for fixed small dimension (higher 
than 2). Here we prove that, for any fixed dimension n, the number of iterations 
of the LLL algorithm is linear with respect to the size of the input. It is easy to 
deduce from [17] that the linear order is optimal. Moreover in 3 dimensions, we 
give a tight bound for the maximum number of iterations and we characterize 
precisely the output basis. Our bound also improves the known one for the usual 
(non-optimal) LLL algorithm. 



1 Introduction 

A Euclidean lattice is the set of all integer linear combinations of $p$ linearly independent
vectors in $\mathbb{R}^n$. Any lattice can be generated by many bases (all of them of cardinality
$p$). The lattice basis reduction problem is to find bases with good Euclidean properties,
that is, sufficiently short and almost orthogonal vectors. The problem is old and there
exist numerous notions of reduction; the most natural ones are due to Minkowski or
to Korkine-Zolotarev. For a general survey, see for example [8,16]. Both of these
reduction processes are "strong", since they build reduced bases with, in some sense, the best
Euclidean properties. However, they are also computationally hard to find, since they
demand that the first vector of the basis be a shortest one in the lattice. It appears that
finding such an element in a lattice is likely to be NP-hard [18,1,5].

Fortunately, even approximate answers to the reduction problem have numerous
theoretical and practical applications in computational number theory and cryptography:
factoring polynomials with rational coefficients [12], finding linear diophantine
approximations (Lagarias, 1980), breaking various cryptosystems [15] and integer linear
programming [7,11]. In 1982, Lenstra, Lenstra and Lovász [12] gave a powerful
approximation reduction algorithm. It depends on a real approximation parameter $\delta \in [1, 2[$
and is called LLL($\delta$). It is a possible generalization of its 2-dimensional version, which
is the famous Gauss algorithm. The celebrated LLL algorithm seems difficult to analyze
precisely, both in the worst case and in the average case. The original paper [12] gives an
upper bound for the number of iterations of LLL($\delta$), which is polynomial in the data
size, for all values of $\delta$ except the optimal value 1: When given $n$ input vectors of $\mathbb{R}^n$ of
length at most M, the data size is 0(n^ log M) and the upper bound is log^ M + n. 
When the approximation parameter <5 is 1 , the only known upper-bound is M” , which 
is exponential even for hxed dimension. It was still an open problem whether the opti- 
mal LLL algorithm is polynomial. In this paper, we prove that the number of iterations 
of the algorithm is linear for any hxed dimension. More precisely, it is 0(A" log M) 
where A is any constant strictly greater than We prove also that under a 

quite reasonable heuristic principle, the number of iterations is 0((2/v/3)” log M). 
In the 3-dimensional case (notice that the problem was totally open even in this case), 
we provide a precise linear bound, which is even better than the usual bounds on the 
non-optimal versions of the LLL algorithm. Several reasons motivate our work on the 
complexity of the optimal LLL algorithm. 

1. This problem is cited as an open question by respected authors [4,17] and I think that 
the answer will bring at least a better understanding of the lattice reduction process. Of 
course, this paper is just an insight to the general answer of the question. 

2. The optimal LLL algorithm provides the strongest Lovasz-reduced basis in a lattice 
(the best bounds on the classical length defects and orthogonality defect). In many 
applications, people seem to be interested by such a basis [13], and sometimes even in 
fixed low dimension [14]. 

3. We believe that the complexity of finding an optimal Lovasz-reduced basis is of great 
interest and the LLL algorithm is the most natural way to find an optimal Lovasz-reduced 
basis in a lattice (we develop it more in the conclusion). 

Plan of the paper. Section 2 presents the LLL algorithm and introduces some definitions 
and notations. In Section 3, we recall some known results in 2 dimensions. Section 4 
deals with the worst-case complexity of the optimal LLL algorithm in 3-dimensional 
case. Finally, in Section 5, we prove that in any fixed dimension, the number of iterations 
of the LLL algorithm is linear with respect to the length of the input. 



2 General Description of the LLL Algorithm 



Let $\mathbb{R}^p$ be endowed with the usual scalar product $(\cdot,\cdot)$ and Euclidean length $|u| =
(u, u)^{1/2}$. The notation $(u)_{\perp H}$ will denote the projection of the vector $u$ in the classical
orthogonal space $H^\perp$ of $H$ in $\mathbb{R}^p$. The set $\langle u_1, u_2, \ldots, u_r \rangle$ denotes the vector space
spanned by a family of vectors $(u_1, u_2, \ldots, u_r)$. A lattice of $\mathbb{R}^p$ is the set of all integer
linear combinations of a set of linearly independent vectors. Generally it is given by one
of its bases $(b_1, b_2, \ldots, b_n)$ and the number $n$ is the dimension of the lattice. So, if $M$ is
the maximum length of the vectors $b_i$, the data-size is $O(n^2 \log M)$ and when working in
fixed dimension, the data-size is $O(\log M)$. The determinant $\det(L)$ of the lattice $L$ is
the volume of the $n$-dimensional parallelepiped spanned by the origin and the vectors of
any basis. Indeed it does not depend on the choice of a basis. The usual Gram-Schmidt
orthogonalization process builds in polynomial time from a basis $b = (b_1, b_2, \ldots, b_n)$
an orthogonal basis $b^* = (b_1^*, b_2^*, \ldots, b_n^*)$ (which is generally not a basis for the lattice
generated by $b$) and a lower-triangular matrix $m = (m_{ij})$ that expresses the system $b$
into the system $b^*$. By construction,
We recall that if $L$ is the lattice generated by the basis $b$, its determinant $\det(L)$ is
expressed in terms of the lengths $|b_i^*|$: $\det(L) = \prod_{i=1}^{n} |b_i^*|$.

The ordered basis $b$ is called proper if $|m_{ij}| \le 1/2$, for $1 \le j < i \le n$. There exists a
simple polynomial-time algorithm which makes any basis proper by means of integer
translations of each $b_i$ in the directions of $b_j$, for $j$ decreasing from $i - 1$ to 1.

Definition 1. [12] For any $\delta \in [1, 2[$, the basis $(b_1, \ldots, b_n)$ is called $\delta$-reduced (or
LLL($\delta$)-reduced or $\delta$-Lovász-reduced) if it fulfils the two conditions:

(i) $(b_1, \ldots, b_n)$ is proper.

(ii) $\forall i \in \{1, 2, \ldots, n - 1\}$, $\quad (1/\delta)\, \left|(b_i)_{\perp \langle b_1, \ldots, b_{i-1} \rangle}\right| \le \left|(b_{i+1})_{\perp \langle b_1, \ldots, b_{i-1} \rangle}\right|$.

The optimal LLL algorithm ($\delta = 1$) is a possible generalization of its 2-dimensional
version, the famous Gauss' algorithm, whose precise analysis is already done both in the
worst case [10,17,9] and in the average case [6]. In the sequel, a reduced basis always
denotes an optimal LLL-reduced one. When talking about the algorithm without further
precision, we always mean the optimal LLL algorithm.

For every integer $i$ in $\{1, 2, \ldots, n - 1\}$, we call $B_i$ the two-dimensional basis formed by
the two vectors $(b_i)_{\perp \langle b_1, \ldots, b_{i-1} \rangle}$ and $(b_{i+1})_{\perp \langle b_1, \ldots, b_{i-1} \rangle}$. Then, by the previous Definition,
$(b_1, \ldots, b_n)$ is reduced iff it is proper and all the bases $B_i$ are reduced.



Definition 2. Let $t$ be a real parameter such that $t > 1$. We call a basis $(b_1, \ldots, b_n)$
$t$-quasi-reduced if it satisfies the following conditions:

(i) the basis $(b_1, \ldots, b_{n-1})$ is proper.

(ii) For all $1 \le i \le n - 2$, the bases $B_i$ are reduced.

(iii) The last basis $B_{n-1}$ is not reduced but it is $t$-reduced: $|m_{n,n-1}| \le 1/2$ and
$(1/t)\, \left|(b_{n-1})_{\perp \langle b_1, \ldots, b_{n-2} \rangle}\right| \le \left|(b_n)_{\perp \langle b_1, \ldots, b_{n-2} \rangle}\right|$.

In other words, whenever the beginning basis $(b_1, \ldots, b_{n-1})$ is reduced, but the whole
basis $b = (b_1, \ldots, b_n)$ is not, then for all $t > 1$ such that the last two-dimensional basis
$B_{n-1}$ is $t$-reduced, the basis $b$ is called $t$-quasi-reduced.

Here is a simple enunciation of the LLL($\delta$) algorithm:

The LLL($\delta$)-reduction algorithm:

Input: A basis $b = (b_1, \ldots, b_n)$ of a lattice $L$.

Output: A LLL($\delta$)-reduced basis $b$ of the lattice $L$.

Initialization: Compute the orthogonalized system $b^*$ and the matrix $m$.

$i := 1$;

While $i < n$ do

$b_{i+1} := b_{i+1} - \lfloor m_{i+1,i} \rceil\, b_i$ ($\lfloor x \rceil$ is the integer nearest to $x$).

Test: Is the two-dimensional basis $B_i$ $\delta$-reduced?

If true, make $(b_1, \ldots, b_{i+1})$ proper by translations; set $i := i + 1$;

If false, swap $b_i$ and $b_{i+1}$; update $b^*$ and $m$; if $i \ne 1$ then set $i := i - 1$;

During an execution of the algorithm, the index $i$ varies in $\{1, \ldots, n\}$. It is called the
current index. When $i$ equals some $k \in \{1, \ldots, n - 1\}$, the beginning lattice generated by
$(b_1, \ldots, b_k)$ is already reduced. Then, the reduction of the basis $B_k$ is tested. If the test is
positive, the basis $(b_1, \ldots, b_{k+1})$ is made proper and the beginning lattice generated by
$(b_1, \ldots, b_{k+1})$ is then reduced. So, $i$ is incremented. Otherwise, the vectors $b_k$ and $b_{k+1}$
are swapped. At this moment, nothing guarantees that $(b_1, \ldots, b_k)$ "remains" reduced.
So, $i$ is decremented and the algorithm updates $b^*$ and $m$, translates the new $b_k$ in
the direction of $b_{k-1}$ and tests the reduction of the basis $B_{k-1}$. Thus the index $i$ may
fall down to 1. Finally when $i$ equals $n$, the whole basis is reduced and the algorithm
terminates. An example of the variation of the index $i$ is shown by Figure 1.
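The loop described above can be sketched in a few lines of Python. This is an illustrative version under stated assumptions, not the author's implementation: the test "is $B_i$ $\delta$-reduced?" is taken to be $|b_i^*|^2 \le \delta^2\,(|b_{i+1}^*|^2 + m_{i+1,i}^2 |b_i^*|^2)$, the Gram-Schmidt data are recomputed from scratch instead of being updated, indices are 0-based, and exact rational arithmetic is used so that $\delta = 1$ is handled without rounding issues. All function names are hypothetical.

```python
from fractions import Fraction

def dot(u, v):
    """Exact scalar product."""
    return sum(Fraction(x) * y for x, y in zip(u, v))

def gram_schmidt(b):
    """Return the orthogonalized system b* and the lower-triangular matrix m."""
    n = len(b)
    b_star = []
    m = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        v = [Fraction(x) for x in b[i]]
        for j in range(i):
            m[i][j] = dot(b[i], b_star[j]) / dot(b_star[j], b_star[j])
            v = [vk - m[i][j] * sk for vk, sk in zip(v, b_star[j])]
        b_star.append(v)
    return b_star, m

def translate(b, i, j):
    """Integer translation of b[i] in the direction of b[j], making |m[i][j]| <= 1/2."""
    _, m = gram_schmidt(b)
    q = round(m[i][j])
    b[i] = [x - q * y for x, y in zip(b[i], b[j])]

def lovasz_ok(b, i, delta):
    """Assumed form of the test on the two-dimensional basis B_i (0-based i)."""
    b_star, m = gram_schmidt(b)
    left = dot(b_star[i], b_star[i])
    right = dot(b_star[i + 1], b_star[i + 1]) + m[i + 1][i] ** 2 * left
    return left <= delta ** 2 * right

def lll(basis, delta=Fraction(1)):
    """Return (reduced basis, number of iterations of the while loop)."""
    b = [list(v) for v in basis]
    n = len(b)
    i, iterations = 0, 0
    while i < n - 1:
        iterations += 1
        translate(b, i + 1, i)
        if lovasz_ok(b, i, delta):
            for j in range(i - 1, -1, -1):   # make (b_1, ..., b_{i+1}) proper
                translate(b, i + 1, j)
            i += 1
        else:
            b[i], b[i + 1] = b[i + 1], b[i]  # swap, then step back if possible
            if i > 0:
                i -= 1
    return b, iterations

def is_reduced(b, delta=Fraction(1)):
    """Check properness and the test on every B_i."""
    _, m = gram_schmidt(b)
    n = len(b)
    proper = all(abs(m[i][j]) <= Fraction(1, 2) for i in range(n) for j in range(i))
    return proper and all(lovasz_ok(b, i, delta) for i in range(n - 1))

reduced, iterations = lll([[1, 1, 1], [-1, 0, 2], [3, 5, 6]])
print(reduced, iterations)
```

The returned iteration count is the quantity whose worst case the paper bounds; recomputing the Gram-Schmidt system at every step keeps the sketch short at the price of efficiency.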



Fig. 1. Variation of the index $i$ presented as a walk (vertical axis: value of the current index $i$).



In the sequel an iteration of the LLL algorithm is precisely an iteration of the "while"
loop in the previous enunciation. So each iteration has exactly one test (Is the two-dimensional
basis $B_i$ reduced?) and the number of steps is exactly the number of tests.
Notice that whenever a test at a level $i$ is negative, i.e. the basis $B_i$ is not reduced,
after the swap of $b_i$ and $b_{i+1}$ the determinant $d_i$ of the lattice generated by $(b_1, \ldots, b_i)$ is decreased.
Moreover, for any $t > 1$, if at the moment of the test the basis $B_i$ is not even $t$-reduced,
the determinant $d_i$ is decreased by a factor at least $1/t$. This explains the next definition.

Definition 3. For a real parameter $t > 1$, a step of index $i$ is called $t$-decreasing if at
the moment of the test, the basis $B_i$ is not $t$-reduced. Else, it is called $t$-non-decreasing.

[12] pointed out that during the execution of a non-optimal LLL algorithm, say LLL($\delta$)
for some $\delta > 1$, all steps with negative tests are $\delta$-decreasing. Similarly, we assert the
next lemma, based on the decrease of the integer quantity

$$D := \prod_{i=1}^{n-1} d_i^2 = \prod_{i=1}^{n-1} \prod_{j=1}^{i} |b_j^*|^2 \qquad (2)$$

by a factor $1/t^2$, whenever a step is $t$-decreasing (other steps do not make $D$ increase).

Lemma 1. Let the LLL(1) algorithm run on an integer input $(b_1, \ldots, b_n)$ of length
$\log M$. For any $t > 1$, the number of $t$-decreasing steps is less than $\frac{n(n-1)}{2} \log_t M$.

Definition 4. A phase is a sequence of steps that occur between two successive tests of
reduction of the last two-dimensional lattice $B_{n-1}$. For a real $t > 1$, we say that a phase
is a $t$-phase if at the beginning of the phase the basis $(b_1, \ldots, b_n)$ is $t$-quasi-reduced.
Phases are classified in two groups: A phase is called of type I if during the next phase
the first vector $b_1$ is never swapped. Else, it is called of type II (see Figure 1).



3 Some Known Results in 2 Dimensions: Gauss’ Algorithm 

In two dimensions a phase of the algorithm is an iteration (of the "while" loop) and the
only positive test occurs at the end of the algorithm. Thus the number of steps equals the
number of negative tests plus one. For any $t > 1$, before the input is $t$-quasi-reduced,
each step is $t$-decreasing. So by Lemma 1 any input basis $(b_1, b_2)$ will be $t$-quasi-reduced
within at most $\log_t M$ steps. Then the next Lemma leads to the bound $\log_t M + 2$ for the
total number of steps of Gauss' algorithm. This bound is not optimal [17]. However, in the
next sections we generalize this argumentation to the case of an arbitrary fixed dimension.
Notice that the Lemma does not suppose that the input basis is integral and this fact is
used in the sequel (proof in [2]).

Lemma 2. For any $t \in\ ]1, \sqrt{3}]$, during the execution of Gauss' algorithm on any input
basis (not necessarily integral), there are at most 2 $t$-non-decreasing steps.



4 The 3-Dimensional Case 

Let $t$ be a real parameter such that $1 < t \le 3/2$. Here we count separately the iterations
that are inside $t$-phases and the iterations that are not inside $t$-phases.

First we show that the total number of steps that are not inside f-phases is linear with 
respect to the input length log M (Lemma 3). Second we prove that the total number 
of iterations inside f-phases is always less than nine (Lemma 4). Thus we exhibit for 
the first time a linear bound for the number of iterations of the LLL algorithm in 3- 
dimensional space (the naive bound is M®.) In addition, our argumentation gives a 
precise characterization of a reduced basis in the three-dimensional space. 

Theorem 1. The number of iterations of the LLL(1 ) algorithm on an input integer basis 
{b\,b 2 ,bz) of length log M is less than log^ M + 6 logg /2 M + 9. 

The linear order for our bound is in fact optimal since it is so in dimension 2 [17] and
one can obviously build from a basis $b$ of $n - 1$ vectors of maximal length $M$, another
basis $b'$ of $n$ vectors of the same maximal length such that the number of iterations of the
LLL algorithm on $b'$ is strictly greater than on $b$. Moreover, even if we have not tried
here to give the best coefficient of linearity in dimension 3, our bound is quite acceptable
since [19] exhibits a family of bases of lengths $\log M$, for which the number of iterations
of the algorithm is greater than $2.6 \log_2 M + 4$. Our bound is $13.2 \log_2 M + 9$. Observe
that the classical bound on the usual non-optimal LLL($2/\sqrt{3}$) is $28.9 \log_2 M + 2$, and
even computed more precisely as in Lemma 3 it remains $24.1 \log_2 M + 2$. So our bound,
which is also valid for LLL($\delta$) with $\delta < 3/2$, improves the classical upper-bound on the
number of steps of LLL($\delta$) provided that $\delta < 1.3$.

The next Lemma (proof in [2]) is a more precise version of Lemma 1 in the particular 
case of 3 dimensions. (It is used in the sequel for t < 3/2.) 

Lemma 3. Let the LLL algorithm run on a integer basis (&i , 62 , 63 ) of length log M. 
Let t be a real parameter such that 1 < t < s/3. The number of steps that are not inside 
any t-phase is less than log^ M + 6 logj M. 

4.1 From a Quasi-reduced Basis to a Reduced One 

Lemma 4. Let t be a real parameter in ]1, 3/2]. When the dimension is fixed at 3, there 
are at most three t-phases during an execution of the algorithm. The total number of 
steps inside t-phases is less than nine. 

The proof is based on Lemma 5 and Corollaries 2 and 1 . Lemma 5 shows that a f-phase 
of type I is necessarily an ending phase and has exactly 3 iterations. 

The central role in the proof is played by Lemma 7 and its Corollary 2 asserting that in 
3 dimensions, there are at most 2 t-phases of type II during an execution. 

Finally, Lemma 6 and Corollary 1 show that any t-phase of type II has at most
3 iterations*.

Remarks. (1) For $t$ chosen closer to 1 ($\sqrt{6/5}$ rather than 3/2), if during an execution,
a t-reduced basis is obtained, then a reduced one will be obtained after at most 9 steps 
(see [2]). (A t-phase is necessarily followed by another t-phase.) (2) Of course, the 
general argumentation of the next section (for an arbitrary fixed dimension) holds here. 
But both of these directions lead to a less precise final upper bound. 

Lemma 5. For all $t \in ]1, \sqrt{3}]$, a t-phase of type I has 3 steps and is an ending phase.

Proof. The vector $b_1$ will not be modified during a phase of type I. Then, by Lemma 2,
the basis $((b_2)_{\perp b_1}, (b_3)_{\perp b_1})$ will be reduced after only two iterations^. But here, there is
one additional step (of current index 1, with a positive test) between these two iterations.

Lemma 6. For any $t > 1$, if a basis $(b_1, \ldots, b_{n-1}, b_n)$ is t-quasi-reduced, then the
basis $(b_1, \ldots, b_{n-2}, b_n)$ is t'-quasi-reduced, with $t' > (2/\sqrt{3})\,t$.

* These facts are used to make the proof clearer but they are not essential in the proof: actually, 
if a phase has more than 3 iterations, then these additional steps (which are necessarily with 
negative tests and with the index i equal to 1) are t-decreasing and all t-decreasing steps are 
already counted by Lemma 3. 

^ Lemma 2 does not demand the input basis to be integral. 
Corollary 1. In 3 dimensions, for all $t \in ]1, 3/2]$, a t-phase of type II has 3 steps.

Proof. Since $(b_1, b_2, b_3)$ is 3/2-quasi-reduced, by the previous Lemma, $(b_1, b_3)$ is
$\sqrt{3}$-quasi-reduced. Then by Lemma 2, $(b_1, b_3)$ will be reduced after 2 steps.

The next Lemma plays a central role in the whole proof. This result, which remains true
when $(b_1, b_2, b_3)$ is reduced, also gives a precise characterization of a 1-Lovász reduced
basis in dimension 3. A detailed proof is available in [2].

Lemma 7. For all $t \in ]1, 3/2]$, if the basis $(b_1, b_2, b_3)$ is t-quasi-reduced and proper,
then among all the vectors of the lattice that are not in the plane $\langle b_1, b_2 \rangle$, there is at most
one pair of vectors $\pm u$ whose lengths are strictly less than $|b_3|$.

Proof (sketch). Let $u := x b_1 + y b_2 + z b_3$ be a vector of the lattice ($(x, y, z) \in \mathbb{Z}^3$).
The vector $u$ is expressed in the orthogonal basis $b^*$ defined by (1) and its length satisfies

$|u|^2 = (x + y\,m_{21} + z\,m_{31})^2 |b_1^*|^2 + (y + z\,m_{32})^2 |b_2^*|^2 + z^2 |b_3^*|^2.$

First, since $(b_1, b_2, b_3)$ is 3/2-quasi-reduced, one gets easily that if $|z| > 1$ or $|y| > 1$ or
$|x| > 1$, then $|u| > |b_3|$. Now, if $z = 1$, by considering the ratio $|u|^2/|b_3|^2$, one shows
that there exists at most one pair $(x, y) \in \{0, 1, -1\}^2 \setminus \{(0, 0)\}$ such that $|u| < |b_3|$. This
unique vector depends on the signs of $m_{21}$, $m_{31}$ and $m_{32}$ as recapitulated by Table 1.



  m_21   m_31   m_32   u
   +      +      +     b_3 - b_2
   +      +      -     b_3 - b_1 + b_2
   +      -      +     b_3 + b_1 - b_2
   +      -      -     b_3 + b_2
   -      +      +     b_3 - b_1 - b_2
   -      +      -     b_3 + b_2
   -      -      +     b_3 - b_2
   -      -      -     b_3 + b_1 + b_2

Table 1. The unique vector possibly strictly shorter than $b_3$, as a function of the signs of $m_{ij}$.
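
The length formula used in the proof sketch of Lemma 7 can be checked numerically. The following is a minimal sketch, not the paper's procedure: it computes the Gram-Schmidt data of a 3-dimensional integer basis with exact rational arithmetic, verifies the expression of $|u|^2$ in the orthogonal basis, and lists the pairs $(x, y)$ with $u = x b_1 + y b_2 + b_3$ strictly shorter than $b_3$. The function names and the example bases are assumptions chosen for illustration; the example bases are proper ($|m_{ij}| \le 1/2$) but are not checked against the paper's definition of t-quasi-reduced.

```python
from fractions import Fraction
from itertools import product


def dot(u, v):
    """Exact inner product (entries may be ints or Fractions)."""
    return sum(Fraction(a) * b for a, b in zip(u, v))


def gram_schmidt(basis):
    """Return (b_star, m) with b_i = b_i* + sum_{j<i} m[i][j] * b_j*."""
    b_star, m = [], []
    for b in basis:
        coeffs = [dot(b, s) / dot(s, s) for s in b_star]
        v = [Fraction(x) for x in b]
        for c, s in zip(coeffs, b_star):
            v = [vi - c * si for vi, si in zip(v, s)]
        b_star.append(v)
        m.append(coeffs)
    return b_star, m


def squared_length_via_formula(basis, x, y, z):
    """|u|^2 for u = x*b1 + y*b2 + z*b3, written in the orthogonal basis."""
    b_star, m = gram_schmidt(basis)
    m21, m31, m32 = m[1][0], m[2][0], m[2][1]
    n1, n2, n3 = (dot(s, s) for s in b_star)
    return (x + y * m21 + z * m31) ** 2 * n1 + (y + z * m32) ** 2 * n2 + z ** 2 * n3


def shorter_than_b3(basis):
    """Pairs (x, y) in {-1,0,1}^2 minus (0,0) with |x*b1 + y*b2 + b3| < |b3|."""
    b1, b2, b3 = basis
    target = dot(b3, b3)
    found = []
    for x, y in product((-1, 0, 1), repeat=2):
        if (x, y) == (0, 0):
            continue
        u = [x * p + y * q + r for p, q, r in zip(b1, b2, b3)]
        # The direct length and the orthogonal-basis formula must agree.
        assert dot(u, u) == squared_length_via_formula(basis, x, y, 1)
        if dot(u, u) < target:
            found.append((x, y))
    return found


# Illustrative basis with m21 = m31 = 2/5 and m32 = 1/2 (all positive).
print(shorter_than_b3([(10, 0, 0), (4, 10, 0), (4, 5, 9)]))  # [(0, -1)]
```

For this example the only strictly shorter candidate is $u = b_3 - b_2$, which is the case where all three coefficients are positive.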



Corollary 2. During an execution of the LLL(1) algorithm in three dimensions, for all
$t \in ]1, 3/2]$, there are at most two t-phases of type II.

Proof. Assume $(b_1, b_2, b_3)$ is the t-quasi-reduced basis at the beginning of a first t-phase
of type II and let $(b_1, b_2, b_3')$ denote the basis obtained from $(b_1, b_2, b_3)$ by
making the latter proper. Since the t-phase is of type II, $|b_3'| < |b_1|$ and the algorithm
swaps $b_1$ and $b_3'$. As $(b_1, b_2)$ is Gauss-reduced, $b_1$ is a shortest vector of the sub-lattice
generated by $(b_1, b_2)$. Thus the fact $|b_3'| < |b_1|$ together with the previous Lemma shows
that in the whole lattice there is at most one pair of vectors $\pm u$ strictly shorter than $b_3'$.
So the vector $b_3'$ can be swapped only once. In particular, only one new t'-phase (for any
$t' > 1$) of type II may occur before the end of the algorithm, and after the first t-phase
($t \le 3/2$) of type II, all phases except possibly one have exactly 2 iterations.

5 Arbitrary Fixed Dimension n 

In the previous section, we argued in an additive way: We chose a tractable value $t_0$
(3/2 in 3 dimensions) such that for $1 < t \le t_0$ we could easily count the maximum
number of steps inside all t-phases. Then we added the last bound (9 in the last Section)
to the total number of iterations that were not inside t-phases.

Here we argue differently. On one hand, the total number of t-decreasing steps is
classically upper bounded by $\log_t M$ (Lemma 1). Now, for all $t > 1$, we call a
t-non-decreasing sequence a sequence of consecutive t-non-decreasing steps. During such
a sequence, just before any negative test of index $i$, the basis $(b_1, \ldots, b_{i+1})$ is
t-quasi-reduced. The problem is that during a t-non-decreasing sequence, we cannot quantify
efficiently the decrease of the usual integer potential function^ $D$ (whose definition is
recalled in (2)). The crucial point here (Proposition 1) is that for all $t \in ]1, \sqrt{3}]$, there
exists some integer $c(n, t)$ such that any t-non-decreasing sequence of the LLL(1)
algorithm, when it works on an arbitrary input basis $(b_1, \ldots, b_n)$ (no matter the lengths
of the vectors), has strictly less than $c(n, t)$ steps. In short, any sequence of iterations
which is longer than $c(n, t)$ has a t-decreasing step.

Hence, our argumentation is in some sense multiplicative since the total number of
iterations with negative tests is thus bounded from above by $c(n, t)\, n^2 \log_t M$. We deduce
the following theorem which for the first time exhibits a linear bound on the number of
iterations of the LLL algorithm in fixed dimension.

Theorem 2. For any fixed dimension $n$, let the optimal LLL algorithm run on an integer
input $(b_1, \ldots, b_n)$ of length $\log M$. The maximum number of iterations $K$ satisfies:

(i) for every constant $A > (2/\sqrt{3})^{1/6}$, $K$ is $O(A^{n^3} \log M)$.

(ii) under a very plausible heuristic, K is also (9^(2/-\/3)" logM^. 

The first formulation (i) is based on Proposition 1, and Lemmata 1,8. For the second 
formulation (ii) we use also Lemma 9 (proved under a very plausible heuristic). 

The next Lemma is an adaptation of counting methods used by Babai, Kannan and 
Schnorr [3,7,14] when finding a shortest vector in a lattice with a Lovasz-reduced basis 
on hand. For a detailed proof, see [2]. 

Lemma 8. Let $t \in ]1, 2[$ be a real parameter and $\mathcal{L}$ be a lattice generated by a basis
$b := (b_1, \ldots, b_n)$, which is not necessarily integral and whose vectors are of arbitrary
length. If $b$ is proper and t-quasi-reduced then there exists an integer $\alpha(n, t)$ such that
the number of vectors of the lattice $\mathcal{L}$ whose lengths are strictly less than $|b_1|$ is strictly
less than $\alpha(n, t)$. Moreover,

$\alpha(n, t) < \frac{\sqrt{3}\,t}{\sqrt{4 - t^2}}\; 3^{n-1}\, (2/\sqrt{3})^{n(n-1)/2}.$ (3)

Remark. The sequence $\alpha(n, t)$ is increasing with $n$ (and also with $t$).

Proposition 1. Let $n$ be a fixed dimension and $t$ a real parameter in $]1, \sqrt{3}]$. There exists
an integer $c(n, t)$ such that the length of any t-non-decreasing sequence of the LLL(1)
algorithm, on any input basis $(b_1, \ldots, b_n)$, no matter the lengths of its vectors and no
matter whether the basis is integral, is strictly less than $c(n, t)$.

^ The naive bound is obtained using only the fact that D is a strictly positive integer less than 
and it is strictly decreasing at each step with a negative test. 
Proof (sketch). By induction on $n$. The case $n = 2$ is trivial and $c(2, t) = 2$ (Lemma
2). Suppose that the assertion holds for any basis of $n - 1$ vectors and let the algorithm
run on a basis $b := (b_1, \ldots, b_n)$. Let us consider the longest possible t-non-decreasing
sequence. After at most $c(n-1, t)$ t-non-decreasing steps, $b$ is t-quasi-reduced.

If the next phase is of type I, then the algorithm actually works with the basis of cardinality
$n - 1$, $b_{\perp b_1} := ((b_2)_{\perp b_1}, \ldots, (b_n)_{\perp b_1})$, which is also t-quasi-reduced. Then by the
induction hypothesis, the t-non-decreasing sequence will be finished after at most
$c(n-1, t) + \alpha(n-1, t)$ more steps^.

On the other hand, there are at most $\alpha(n, t)$ successive phases of type II, since Lemma
8 asserts that the first vector of the t-quasi-reduced basis $(b_1, \ldots, b_n)$ can be modified
at most $\alpha(n, t)$ times. Each of them has no more than $c(n-1, t)$ steps, because the
algorithm actually works on $(b_1, \ldots, b_{n-1})$.

After the last t-phase of type II, there may be one more t-phase of type I. Finally, since
$\alpha(n, t)$ is increasing with respect to $n$, the quantity $c(n, t)$ is less than

$c(n-1, t) + c(n-1, t)\,\alpha(n, t) + c(n-1, t) + \alpha(n-1, t) < (c(n-1, t) + 1)(\alpha(n, t) + 2),$

and finally $c(n, t) + 1 \le (c(2, t) + 1) \prod_{i=3}^{n} (\alpha(i, t) + 2)$. (4)
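
The step from the count of the three kinds of phases to relation (4) can be spelled out as follows. This is a worked sketch under the assumptions that $\alpha(n, t)$ denotes the integer of Lemma 8, that it is non-decreasing in $n$, and that all quantities are positive integers; the product is taken to start at $i = 3$ because the induction starts from $c(2, t)$.

```latex
\begin{align*}
c(n,t) &\le c(n-1,t) + c(n-1,t)\,\alpha(n,t) + c(n-1,t) + \alpha(n-1,t) \\
       &=   c(n-1,t)\bigl(\alpha(n,t) + 2\bigr) + \alpha(n-1,t) \\
       &\le c(n-1,t)\bigl(\alpha(n,t) + 2\bigr) + \alpha(n,t)
            && \text{(monotonicity of } \alpha \text{ in } n) \\
       &<   \bigl(c(n-1,t) + 1\bigr)\bigl(\alpha(n,t) + 2\bigr). \\[4pt]
\text{Hence}\quad
c(n,t) + 1 &\le \bigl(c(n-1,t) + 1\bigr)\bigl(\alpha(n,t) + 2\bigr)
            && \text{(integers)} \\
           &\le \bigl(c(2,t) + 1\bigr) \prod_{i=3}^{n} \bigl(\alpha(i,t) + 2\bigr)
            && \text{(unrolling down to } n = 2).
\end{align*}
```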

Proof ((i) of Theorem 2). Each sequence of $c(n, t)$ steps contains at least one
t-decreasing step. At each t-decreasing step, the quantity $D$, which is always in the
interval $[1, M^{n^2}]$, is multiplied by a factor of at most $1/t$. So the total number of iterations of the
algorithm is always less than $c(n, t)\, n^2 \log_t M$. Now by choosing a fixed $t \in ]1, \sqrt{3}]$,
relations (3) and (4) together show that the quantity $n^2 c(n, t)$ is bounded from above by
$A^{n^3}$, where $A$ is any constant greater than $(2/\sqrt{3})^{1/6}$.

In the first proof, we choose for $t$ an arbitrary value in the interval $]1, \sqrt{3}]$. Now, we
improve our bound by choosing $t$ as a function of $n$. What we really need here is to
evaluate the number of possible successive t-phases of type II. So the main question is:
When a basis $(b_1, \ldots, b_n)$ of a lattice $L$ is t-quasi-reduced, how many lattice points
$u$ satisfy $(1/t)|b_1| < |u| < |b_1|$? More precisely, is it possible to choose $t$, as
a function of the dimension $n$, such that the open volume between the two n-dimensional
balls of radii $|b_1|$ and $(1/t)|b_1|$ does not contain any lattice point?

Now, we answer these questions under a quite reasonable heuristic principle which is 
often satisfied. So the bound on C(n) and on the maximum number of iterations will be 
improved. This heuristic is due to Gauss. Consider a lattice of determinant det(L). The 
heuristic claims that the number of lattice points inside a ball B is well approximated 
by volume(B)/det(L). More precisely the error is of the order of the surface of the
ball B. This principle holds for a very large class of lattices, in particular those used in
applications (for instance “regular lattices” where the minima are close to each other 
and where the fundamental parallelepiped is close to a hyper-cube). Moreover, notice 
that this heuristic also leads to the result of Lemma 8 . 
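
The heuristic can be illustrated on the simplest lattice. The sketch below is only an illustration under stated assumptions: it takes the integer lattice $\mathbb{Z}^n$ (determinant 1), counts its points in a ball centred at the origin by brute force, and compares the count with volume(B)/det(L), using the standard formula $\pi^{n/2}/\Gamma(1 + n/2)$ for the volume of the n-dimensional unit ball. The function names are hypothetical.

```python
import math
from itertools import product


def unit_ball_volume(n):
    """Volume of the n-dimensional unit ball: pi^(n/2) / Gamma(1 + n/2)."""
    return math.pi ** (n / 2) / math.gamma(1 + n / 2)


def count_integer_points(n, radius):
    """Number of points of Z^n at Euclidean distance <= radius from the origin."""
    r = int(math.floor(radius))
    r2 = radius * radius
    return sum(
        1
        for p in product(range(-r, r + 1), repeat=n)
        if sum(c * c for c in p) <= r2
    )


def heuristic_estimate(n, radius, det=1.0):
    """Gauss heuristic: volume of the ball divided by the lattice determinant."""
    return unit_ball_volume(n) * radius ** n / det


for n, radius in [(2, 10), (3, 6)]:
    exact = count_integer_points(n, radius)
    estimate = heuristic_estimate(n, radius)
    print(n, radius, exact, round(estimate, 1))
```

In dimension 2 with radius 10 the exact count is 317 against an estimate of about 314.2, so the error is small compared with the perimeter of the disc, in line with the statement that the error is of the order of the surface of the ball.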

Otherwise, there would be a t-non-decreasing sequence of more than $c(n-1, t)$ steps while
the algorithm runs on the basis $(b_1, \ldots, b_{n-1})$.

^ During the $c(n-1, t)$ steps on $b_{\perp b_1}$, each change of the first vector $(b_2)_{\perp b_1}$ (no more than
$\alpha(n-1, t)$, by Lemma 8) is followed by one step (of current index one) with a positive test
which has not been counted yet.

Under this assumption and if $\gamma_n = \pi^{n/2}/\Gamma(1 + n/2)$ denotes the volume of the
n-dimensional unit ball, then the number of lattice points $\beta(n, t)$ that lie strictly between
balls of radii $|b_1|$ and $(1/t)|b_1|$ satisfies (at least asymptotically)

$\beta(n, t_n) < \gamma_n\, (|b_1|^n/\det(L))\, (1 - 1/t_n^n).$ (5)

Now if ( 6 i, . . . , bn) is f-quasi-reduced with t < Vs, 

\b^r/det{L) = \b,r/U7=i\b*\ < 3” (2/V3)^. 

Then, using the classical Stirling approximation, (3{n, t) is bounded from above: 
j3{n,t) <3 ( 7 re/n)”/^ {2/VS) ^ 2 ' (1 — 1 /f"). 

By routine computation, we deduce the following Lemma. 

Lemma 9. Suppose that there exists $n_0$ such that for $n > n_0$, relation (5) is true. Then
there exists a sequence $(t_n)$, $t_n > 1$, satisfying:

(i) $(t_n)$ is decreasing and tends to 1 with respect to $n$.

(ii) foralln>no, P{n,t„) < 1 and l/logf„ < 3n (7re/n)”/^ {2/VS) V \ 
Remark: One deduces from (i) that if (3{n — 1, tn) = 0, then /3(n — 1, f„_i) = 0. 
Proof (sketch for (ii) of Theorem 2).

The quantities $t_n$ and $n_0$ are defined by the previous Lemma. First we prove that for
$n > n_0$ and with the notations of Proposition 1,

$c(n, t_n) \le c(n-1, t_n) + (c(n_0, t_n) + \alpha(n_0, t_n)).$ (6)

Indeed, after $c(n-1, t)$ steps, $(b_1, \ldots, b_n)$ is $t_n$-quasi-reduced and if $H$ denotes
the vector space $\langle b_1, \ldots, b_{n-n_0-1} \rangle$, the $n_0$-dimensional basis $b_{\perp H} :=
((b_{n-n_0})_{\perp H}, \ldots, (b_n)_{\perp H})$ is $t_n$-quasi-reduced as well. Thus by the previous Lemma
during the $t_n$-non-decreasing sequence, its first vector cannot be modified. So from the
first time that $(b_1, \ldots, b_n)$ is $t_n$-quasi-reduced until the end of the $t_n$-non-decreasing
sequence the current index $i$ will always be in the integral interval $\{n - n_0, \ldots, n\}$.
Then by Proposition 1 the sequence of $t_n$-non-decreasing iterations may continue for
at most $c(n_0, t_n) + \alpha(n_0, t_n)$ more iterations.® This ends the proof of relation (6).

So for $n > n_0$, $c(n, t_n) \le (n - n_0 + 1)(c(n_0, t_n) + \alpha(n_0, t_n))$.

Since $t_n < t_{n_0}$, the basis $b_{\perp H}$ is also $t_{n_0}$-quasi-reduced and by Lemma 8, $\alpha(n_0, t_n) \le
\alpha(n_0, t_{n_0})$. (The same relation is true for $k < n_0$.) Finally the quantity $c(n_0, t_n) +
\alpha(n_0, t_n)$ is a constant $B$ that depends only on $n_0$. We have then $c(n, t_n) < nB$. So
a sequence longer than $nB$ always contains a $t_n$-decreasing step and the total number
of iterations is less than $nB\, n^2 \log_{t_n} M$. Finally Lemma 9 gives an upper bound for
$1/\log t_n$ and leads to the (ii) of Theorem 2.

® The quantity $\alpha(n_0, t_n)$ corresponds to the maximum number of positive tests with the current
index $i = n - n_0$, after the first time $b_{\perp H}$ is $t_n$-quasi-reduced.
6 Conclusion

Our paper gives for the first time linear bounds for the maximum number of iterations
of the optimal LLL algorithm, in fixed dimension. I believe that the complexity of finding
an optimal Lovász-reduced basis is of great interest and not well-known.

Kannan presented [7] an algorithm which uses as sub-routine the non-optimal LLL
algorithm ($\delta > 1$) and outputs a Korkine-Zolotarev basis of the lattice in $O(n^n) \log M$
steps. Such an output is also an optimal Lovász-reduced basis (actually it is stronger).
Thus Kannan’s algorithm provides an upper bound on the complexity of finding an
optimal Lovász-reduced basis. For the future, one of the two following possibilities (or
both) has to be considered.

(1) Our upper-bound is likely to be improved. However, observe that in this paper we 
have already improved notably the naive bound for fixed dimension (the exponential 
order is replaced by linear order). For the moment our bound remains worse than the 
one Kannan exhibits for his algorithm. 

(2) The LLL algorithm which is the most natural way to find an optimal Lovasz-reduced 
basis is not the best way (and then the same phenomenon may be possible for finding 
a non-optimal Lovasz-reduced basis: more efficient algorithms than the classical LLL 
algorithm may output the same reduced basis). 
Abstract. Algebras whose underlying set is a complete partial order
and whose term-operations are continuous may be equipped with a least
fixed point operation $\mu x.t$. The set of all equations involving the
$\mu$-operation which hold in all continuous algebras determines the variety
of iteration algebras. A simple argument is given here reducing the
axiomatization of iteration algebras to that of Wilke algebras. It is shown
that Wilke algebras do not have a finite axiomatization. This fact
implies that iteration algebras do not have a finite axiomatization, even by
“hyperidentities”.



1 Introduction 

For a fixed signature $\Sigma$, a $\mu/\Sigma$-algebra $\mathfrak{A} = (A, (\cdot)^{\mathfrak{A}})$ is a $\Sigma$-algebra equipped
with an operation $(\mu x.t)^{\mathfrak{A}}$ for each $\mu/\Sigma$-term $t$. Algebras whose underlying set $A$
is equipped with a complete partial order and whose basic operations $\sigma : A^n \to A$
are continuous, determine $\mu/\Sigma$-algebras in which $\mu x.t$ is defined using least fixed
points (see below). The variety of $\mu/\Sigma$-algebras generated by these continuous
algebras is the variety of $\mu/\Sigma$-iteration algebras. Such algebras have been used in
many studies in theoretical computer science (for only a few of many references,
see [14,8,15,9,11,12,1]).

The main theorem in the current paper shows that the identities satisfied by 
continuous algebras are not finitely based. This result has been known for some 
time, [6], but only in an equivalent form for iteration theories. In this note
we give an argument which may have independent interest. We show how to 
translate “scalar” iteration algebra identities into Wilke algebra identities, [17]. 
Since the identities of Wilke algebras are not finitely based, as we show, the 
same property holds for iteration algebras. 

In fact, we prove a stronger result. We show there is no finite number of hyper- 
identities which axiomatize iteration algebras. Our notion of “hyperidentity” is 
stronger than that introduced by Taylor [19]. (In this extended abstract, we will 
omit many of the proofs.) 



2 $\mu/\Sigma$-Terms and Algebras

In this section, we formulate the notion of a $\mu/\Sigma$-algebra, where $\Sigma$ is a signature,
i.e., a ranked set. We do not assume that the underlying set of an algebra is
partially ordered. Let $V = \{x_1, x_2, \ldots\}$ be a countably infinite set of “variables”,
and let $\Sigma = (\Sigma_0, \Sigma_1, \ldots)$ be a ranked alphabet. The set of $\mu/\Sigma$-terms, denoted
$T_\Sigma$, is the smallest set of expressions satisfying the following conditions:

- each variable is in $T_\Sigma$;

- if $\sigma \in \Sigma_n$ and $t_1, \ldots, t_n \in T_\Sigma$, then $\sigma(t_1, \ldots, t_n)$ is in $T_\Sigma$;

- if $x \in V$ and $t \in T_\Sigma$, then $\mu x.t$ is in $T_\Sigma$.

Every occurrence of the variable $x$ is bound in $\mu x.t$. The free variables occurring
in a term are defined as usual. We use the notation $t = t[x_1, \ldots, x_n]$ to indicate
that $t$ is a term whose free variables are among the distinct variables $x_1, \ldots, x_n$,
and no bound variable is among $x_1, \ldots, x_n$. Perhaps confusingly, we write

$t[t_1/x_1, \ldots, t_n/x_n]$

to indicate the term obtained by simultaneously substituting the terms $t_i$ for
the free occurrences of $x_i$ in $t$, for $i = 1, \ldots, n$. (By convention, we assume no
variable free in $t_i$ becomes bound in $t[t_1/x_1, \ldots, t_n/x_n]$.) But here, we do not
rule out the possibility that there are other free variables in $t$ not affected by
this substitution.
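
The inductive definition of terms, free variables and simultaneous substitution can be made concrete with a small data type. The following is a minimal sketch, not part of the paper: the class names `Var`, `Op` and `Mu` and the functions `free_vars` and `substitute` are hypothetical, and the substitution relies on the same convention as the text, namely that no variable free in a substituted term becomes bound.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Op:
    symbol: str                      # a letter of the signature
    args: Tuple["Term", ...] = ()    # its rank is len(args)


@dataclass(frozen=True)
class Mu:
    var: str                         # the variable bound by mu
    body: "Term"


Term = Union[Var, Op, Mu]


def free_vars(t):
    """Set of names of the variables occurring free in t."""
    if isinstance(t, Var):
        return frozenset({t.name})
    if isinstance(t, Op):
        return frozenset().union(*(free_vars(a) for a in t.args))
    return free_vars(t.body) - {t.var}


def substitute(t, mapping):
    """Simultaneously replace free occurrences of variables by terms.

    Assumes (as in the text) that no capture can occur.
    """
    if isinstance(t, Var):
        return mapping.get(t.name, t)
    if isinstance(t, Op):
        return Op(t.symbol, tuple(substitute(a, mapping) for a in t.args))
    # Occurrences of the bound variable are not free: leave them untouched.
    inner = {k: v for k, v in mapping.items() if k != t.var}
    return Mu(t.var, substitute(t.body, inner))


# mu x. f(x, y): x is bound, y is free.
term = Mu("x", Op("f", (Var("x"), Var("y"))))
print(sorted(free_vars(term)))  # ['y']
```

Frozen dataclasses give structural equality, so two terms built the same way compare equal, which is convenient when checking substitutions.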



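Definition 1. A $\mu/\Sigma$-algebra consists of a $\Sigma$-algebra $\mathfrak{A} = (A, \sigma^{\mathfrak{A}})_{\sigma \in \Sigma}$ and an
assignment of a function $t^{\mathfrak{A}} : A^n \to A$ to each $\mu/\Sigma$-term $t = t[x_1, \ldots, x_n]$ which
satisfies the (somewhat redundant) requirements that

1. $x_i^{\mathfrak{A}}(a_1, \ldots, a_n) = a_i$, $i = 1, 2, \ldots, n$;

2. for each $\sigma \in \Sigma_n$,

$(\sigma(t_1, \ldots, t_n))^{\mathfrak{A}}(a_1, \ldots, a_n) = \sigma^{\mathfrak{A}}(t_1^{\mathfrak{A}}(a_1, \ldots, a_n), \ldots, t_n^{\mathfrak{A}}(a_1, \ldots, a_n));$

3. if $s$ and $t$ differ only in their bound variables, then $s^{\mathfrak{A}} = t^{\mathfrak{A}}$;

4. if $s = s[x_1, \ldots, x_n]$, $t = t[x_1, \ldots, x_n]$ and if

$s^{\mathfrak{A}}(a_1, \ldots, a_n) = t^{\mathfrak{A}}(a_1, \ldots, a_n),$

for all $a_1, \ldots, a_n \in A$, then, for each $i \in [n]$, all $a_j \in A$, $1 \le j \le n$, $j \ne i$,

$(\mu x_i.s)^{\mathfrak{A}}(a_1, \ldots, a_{i-1}, a_{i+1}, \ldots, a_n) = (\mu x_i.t)^{\mathfrak{A}}(a_1, \ldots, a_{i-1}, a_{i+1}, \ldots, a_n);$

5. if $t = t[x_1, \ldots, x_n]$, then the function $t^{\mathfrak{A}}$ depends on at most the arguments
corresponding to the variables occurring freely in $t$.

If $\mathfrak{A}$ and $\mathfrak{B}$ are $\mu/\Sigma$-algebras, a morphism $\varphi : \mathfrak{A} \to \mathfrak{B}$ is a function $A \to B$
such that for all terms $t = t[x_1, \ldots, x_n]$, and all $a_1, \ldots, a_n \in A$,

$\varphi(t^{\mathfrak{A}}(a_1, \ldots, a_n)) = t^{\mathfrak{B}}(\varphi(a_1), \ldots, \varphi(a_n)).$

In particular, a morphism of $\mu/\Sigma$-algebras is also a $\Sigma$-algebra morphism.

As usual, we say that a $\mu/\Sigma$-algebra $\mathfrak{A}$ satisfies an identity $s \approx t$ between
$\mu/\Sigma$-terms if the functions $s^{\mathfrak{A}}$ and $t^{\mathfrak{A}}$ are the same.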



3 Conway and Iteration Algebras 

Definition 2. A $\mu/\Sigma$-Conway algebra is a $\mu/\Sigma$-algebra satisfying the double
iteration identities (1) and composition identities (2)

$\mu x.\mu y.t \approx \mu z.t[z/x, z/y]$ (1)

$\mu x.s[r/x] \approx s[\mu x.r[s/x]/x],$ (2)

for all terms $t = t[x, y, z_1, \ldots, z_p]$, $s = s[x, z_1, \ldots, z_p]$, and $r = r[x, z_1, \ldots, z_p]$
in $T_\Sigma$. A morphism of Conway algebras is just a morphism of $\mu/\Sigma$-algebras.

The class of Conway algebras is the class of all $\mu/\Sigma$-Conway algebras, as $\Sigma$
varies over all signatures.

Letting r be the variable x in the composition identity, we obtain the following 
well known fact. 



Lemma 1. Any $\mu/\Sigma$-Conway algebra satisfies the fixed point identities

$\mu x.s \approx s[\mu x.s/x].$

In particular,

$\mu x.s \approx s,$

whenever $x$ does not occur free in $s$.

We will be mostly concerned with scalar signatures. A signature $\Sigma$ is scalar if
$\Sigma_n = \emptyset$ for $n > 1$. The class of scalar Conway algebras is the collection of all
$\mu/\Sigma$-Conway algebras, as $\Sigma$ varies over all scalar signatures.

Proposition 1. Suppose that $\Sigma$ is a scalar signature. If $\mathfrak{A}$ is a $\mu/\Sigma$-algebra
satisfying the composition identities, then the double iteration identities (1) hold
in $\mathfrak{A}$. Moreover, a $\mu/\Sigma$-algebra $\mathfrak{A}$ is a Conway algebra iff (2) holds in $\mathfrak{A}$ for all
terms $r = r[x]$ and $s = s[x]$. □

Note again that unlike most treatments of $\mu/\Sigma$-algebras, we do not assume that
such an algebra comes with a partial order with various completeness properties
guaranteeing that all functions which are monotone and/or continuous
have least fixed points. We say that a $\mu/\Sigma$-algebra $\mathfrak{A} = (A, \sigma^{\mathfrak{A}})_{\sigma \in \Sigma}$ is
continuous if the underlying set $A$ is equipped with a directed complete partial
order, and each basic function $\sigma^{\mathfrak{A}} : A^n \to A$ preserves all sups of directed sets;
the $\mu$-operator is defined via least fixed points. (See [8].) For example, when a
term $t[x, y]$ denotes such a function $t^{\mathfrak{A}} : A \times A \to A$, say, then $\mu x.t$ denotes the
function $A \to A$ whose value at $b \in A$ is the least $a$ in $A$ such that $a = t^{\mathfrak{A}}(a, b)$.
For us, $\mu x.t$ is interpreted as a function which has no particular properties. Of
course, we will be interested in classes of algebras in which these functions are
required to satisfy certain identities.
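
The least fixed point reading of $\mu x.t$ in a continuous algebra can be illustrated on a finite example. The sketch below is an assumption-laden illustration rather than a construction from the paper: the carrier is the set of subsets of a small vertex set ordered by inclusion, the binary operation `t(a, b)` returns `b` together with the successors of `a` in a hypothetical graph `EDGES`, and the least fixed point in the first argument is computed by iterating from the empty set.

```python
def least_fixed_point(f, bottom=frozenset()):
    """Iterate f from the bottom element until the value stabilises.

    Terminates for monotone f on a finite lattice of sets.
    """
    current = bottom
    while True:
        nxt = f(current)
        if nxt == current:
            return current
        current = nxt


# Hypothetical successor relation on the vertices 1..5.
EDGES = {1: {2}, 2: {3}, 3: {1}, 4: {5}, 5: set()}


def t(a, b):
    """t(a, b) = b together with all successors of a; monotone in a and b."""
    return frozenset(b) | frozenset(s for v in a for s in EDGES[v])


def mu_x_t(b):
    """Value at b of the function denoted by mu x.t: least a with a = t(a, b)."""
    return least_fixed_point(lambda a: t(a, b))


start = frozenset({1})
reachable = mu_x_t(start)
print(sorted(reachable))                  # [1, 2, 3]
print(t(reachable, start) == reachable)   # True: it is a fixed point of t(-, b)
```

Here the value of $\mu x.t$ at `b` is the set of vertices reachable from `b`, and the second printed line is an instance of the fixed point identity $\mu x.t \approx t[\mu x.t/x]$.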

Definition 3. A $\mu/\Sigma$-algebra is a $\mu/\Sigma$-iteration algebra if it satisfies all
identities satisfied by the continuous $\Sigma$-algebras or, equivalently, the regular
$\Sigma$-algebras of [15], or the iterative $\Sigma$-algebras of [11,16]. A morphism of iteration
algebras is a morphism of $\mu/\Sigma$-algebras.

For axiomatic treatments of iteration algebras, see [2]. It is proved in [7] that a
$\mu/\Sigma$-algebra is an iteration algebra iff it is a Conway algebra satisfying certain
“group-identities”.

When $\Sigma$ is either clear from context or not important, we say only iteration
algebra instead of “$\mu/\Sigma$-iteration algebra”. The class of scalar iteration
algebras is the class of all $\mu/\Sigma$-iteration algebras, as $\Sigma$ varies over all scalar
signatures. The following is well known [2].

Proposition 2. If $\mathfrak{A}$ is an iteration algebra, $\mathfrak{A}$ satisfies the power identities:
for each term $t = t[x, y_1, \ldots, y_p]$,

$\mu x.t \approx \mu x.t^n$, $n > 1$,

where $t^1 = t$ and $t^{n+1} := t[t^n/x]$. □
4 Scalar $\mu/\Sigma$-Iteration Algebras

We have already pointed out in Proposition 2 that any iteration algebra satisfies 
the power identities. It turns out that for scalar signatures, the composition and 
power identities are complete. 

Theorem 1. When $\Sigma$ is scalar, a $\mu/\Sigma$-algebra $\mathfrak{A}$ is an iteration algebra iff $\mathfrak{A}$
satisfies the composition identities and the power identities.

Proof. We need prove only that if $\mathfrak{A}$ satisfies the composition and power
identities, then $\mathfrak{A}$ satisfies all iteration algebra identities. The idea of the proof is to
show that in fact $\mathfrak{A}$ is a quotient of a free iteration algebra. □



5 Wilke Algebras 

A Wilke algebra is a two-sorted algebra $A = (A_f, A_\omega)$ equipped with an
associative operation $A_f \times A_f \to A_f$, written $u \cdot v$, a binary operation
$A_f \times A_\omega \to A_\omega$, written $u \cdot x$, and a unary operation $A_f \to A_\omega$, written $u^\dagger$, which satisfies
the following identities:

$(u \cdot v) \cdot w = u \cdot (v \cdot w)$,   $u, v, w \in A_f$   (3)

$(u \cdot v) \cdot x = u \cdot (v \cdot x)$,   $u, v \in A_f$, $x \in A_\omega$   (4)

$(u \cdot v)^\dagger = u \cdot (v \cdot u)^\dagger$,   $u, v \in A_f$   (5)

$(u^n)^\dagger = u^\dagger$,   $u \in A_f$, $n \ge 2$.   (6)

(See [17], where these structures were called “binoids”. In [18,13], it is shown
how Wilke algebras may be used to characterize regular sets of $\omega$-words.)

A morphism $h = (h_f, h_\omega) : A \to B$ of Wilke algebras $A = (A_f, A_\omega)$ and
$B = (B_f, B_\omega)$ is a pair of functions

$h_f : A_f \to B_f$,   $h_\omega : A_\omega \to B_\omega$

which preserve all of the structure, so that $h_f$ is a semigroup morphism, and

$h_\omega(u^\dagger) = h_f(u)^\dagger$,   $u \in A_f$
$h_\omega(u \cdot x) = h_f(u) \cdot h_\omega(x)$,   $u \in A_f$, $x \in A_\omega$.

The function $A_f \times A_\omega \to A_\omega$ is called the action. We refer to the two equations
(3) and (4) as the associativity conditions; equation (5) is the commutativity
condition, and (6) are the power identities.
We will also consider unitary Wilke algebras, which are Wilke algebras $A =
(A_f, A_\omega)$ in which $A_f$ is a monoid with neutral element 1, which satisfies the
unit conditions:

$1 \cdot u = u = u \cdot 1$,   $u \in A_f$
$1 \cdot x = x$,   $x \in A_\omega$.

A morphism of unitary Wilke algebras $h : A \to B$ is a morphism of Wilke
algebras which preserves the neutral element. (It follows that $h_\omega(1^\dagger) = 1^\dagger$.)

We will need a notion weaker than that of a unitary Wilke algebra.

Definition 4. A unitary Wilke prealgebra $(A_f, A_\omega)$ is an algebra with the
operations and constant of a unitary Wilke algebra whose operations need to
satisfy only the associativity and unit conditions. A morphism $h = (h_f, h_\omega) :
(A_f, A_\omega) \to (B_f, B_\omega)$ is defined as for unitary Wilke algebras.

6 Axiomatizing Wilke Algebras 

We adopt an argument from [4] to show that unitary Wilke algebras do not have 
a finite axiomatization. Thus, neither do Wilke algebras. In particular, we prove 
the following fact. 

Proposition 3. For each prime $p > 2$ there is a unitary Wilke prealgebra $M_p =
(M_f, M_\omega)$ which satisfies the commutativity condition (5) and all power identities
(6) for integers $n < p$. However, for some $u \in M_f$, $(u^p)^\dagger \ne u^\dagger$. Thus, (unitary)
Wilke algebras have no finite axiomatization.

Proof. Let $\bot, \top$ be distinct elements not in $\mathbb{N}$, the set of nonnegative integers.
We define a function $\rho_p : \mathbb{N} \to \{\bot, \top\}$ as follows.

$\rho_p(n) = \bot$ if $p$ divides $n$,
$\rho_p(n) = \top$ otherwise.

Let $M_f = \mathbb{N}$, and $M_\omega = \{\bot, \top\}$. The monoid operation $u \cdot v$ on $M_f$ is addition,
$u + v$, and the action of $M_f$ on $M_\omega$ is trivial: $u \cdot x = x$, for $u \in M_f$, $x \in M_\omega$.
Lastly, define $u^\dagger = \rho_p(u)$. It is clear that $(M_f, M_\omega)$ is a unitary Wilke prealgebra
satisfying the commutativity condition.

Now, $p$ divides $u^n = nu$ iff $p$ divides $n$ or $p$ divides $u$. Thus, in particular, for
$n < p$, $(u^n)^\dagger = u^\dagger$. Also, $(uv)^\dagger = u(vu)^\dagger$, since $uv = vu$ and the action is trivial.
But if $u = 1$, then $u^p = p$ and $u^\dagger = \top \ne \bot = p^\dagger = (u^p)^\dagger$.

Now if there were any finite axiomatization of unitary Wilke algebras, then, by 
the compactness theorem, there would be a finite axiomatization consisting of 
the associativity, commutativity and unit conditions, together with some finite 
subset of the power identities. This has just been shown impossible. □ 
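
The counterexample in the proof of Proposition 3 is small enough to be checked mechanically. The following sketch models it under naming assumptions of our own (`dagger`, `mul`, `act`, `power` and the two string constants are hypothetical names): the finite sort is the natural numbers under addition, the action is trivial, and the unary operation sends $u$ to one of two distinguished values according to whether $p$ divides $u$. The checks run over finite samples only, so they illustrate the argument rather than prove it.

```python
BOT, TOP = "bot", "top"   # the two elements of the second sort


def dagger(u, p):
    """The unary operation of M_p: BOT if p divides u, TOP otherwise."""
    return BOT if u % p == 0 else TOP


def mul(u, v):
    """Monoid operation on the first sort: addition of natural numbers."""
    return u + v


def act(u, x):
    """Trivial action of the first sort on the second sort."""
    return x


def power(u, n):
    """n-fold product of u with itself, which is n * u in additive notation."""
    return n * u


def power_identity_holds(p, n, sample=range(0, 200)):
    """Check (u^n)^dagger == u^dagger on a finite sample of u."""
    return all(dagger(power(u, n), p) == dagger(u, p) for u in sample)


def commutativity_holds(p, sample=range(0, 30)):
    """Check (u.v)^dagger == u.(v.u)^dagger on a finite sample."""
    return all(
        dagger(mul(u, v), p) == act(u, dagger(mul(v, u), p))
        for u in sample
        for v in sample
    )


p = 7
print([n for n in range(2, p + 1) if not power_identity_holds(p, n)])  # [7]
print(commutativity_holds(p))                                          # True
```

For $p = 7$ the power identities with exponents 2 to 6 hold on the sample while the one with exponent 7 fails at $u = 1$, matching the statement of the proposition.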
7 From Iteration Algebras to Wilke Algebras and Back

Suppose that $\Sigma$ is a signature, not necessarily scalar, and that $\mathfrak{B}$ is a
$\mu/\Sigma$-algebra. Let $A_f$ denote the set of all functions $t^{\mathfrak{B}} : B \to B$ for $\mu/\Sigma$-terms $t = t[x]$
having at most the one free variable $x$, and let $A_\omega$ denote the set of elements
of the form $t^{\mathfrak{B}}$, for $\mu/\Sigma$-terms $t$ having no free variables. We give $A = (A_f, A_\omega)$
the structure of a unitary Wilke prealgebra as follows.

For $a_i = t_i^{\mathfrak{B}} \in A_f$, $i = 1, 2$, and $w = s^{\mathfrak{B}} \in A_\omega$,

$a_1 \cdot a_2 := (t_1[t_2/x])^{\mathfrak{B}}$;   $a_1 \cdot w := (t_1[s/x])^{\mathfrak{B}}$;   $a_1^\dagger := (\mu x.t_1)^{\mathfrak{B}}$.

Proposition 4. With the above definitions, $(A_f, A_\omega)$ is a unitary Wilke
prealgebra. Moreover, $(A_f, A_\omega)$ is a unitary Wilke algebra iff $\mathfrak{B}$ is an iteration algebra.

□ 



Notation: We write $\mathfrak{B}W$ for the unitary Wilke algebra determined by the
iteration algebra $\mathfrak{B}$.

We want to give a construction of a scalar $\mu/\Sigma$-algebra $\mathfrak{A}[B]$ from a unitary
Wilke prealgebra $B = (B_f, B_\omega)$. So let $B = (B_f, B_\omega)$ be a unitary Wilke
prealgebra. Define the scalar signature $\Sigma$ as follows:

$\Sigma_1 := \{\sigma_b : b \in B_f\}$

$\Sigma_0 := \{\sigma_z : z \in B_\omega\}$.

Let the underlying set $A$ of $\mathfrak{A}[B]$ be $B_f \cup B_\omega$. The functions $\sigma^{\mathfrak{A}}$ for $\sigma \in \Sigma_1$ are
defined as follows:

$\sigma_b^{\mathfrak{A}}(w) := b \cdot w.$

For $z \in B_\omega$, $\sigma_z^{\mathfrak{A}} := z$. Lastly, we define the functions $t^{\mathfrak{A}}$ for all $\mu/\Sigma$-terms
$t = t[x]$, by induction on the structure of the term $t$. By Lemma 1, we need
only consider the case $t = \mu x.s[x]$ and $x$ occurs free in $s$. In this case, $s$ is
$\sigma_{b_1}(\ldots \sigma_{b_k}(x) \ldots)$, for some $k \ge 0$ and $b_j \in B_f$. If $k = 0$, $(\mu x.x)^{\mathfrak{A}} := 1^\dagger$;
otherwise,

$(\mu x.\sigma_{b_1}(\ldots \sigma_{b_k}(x) \ldots))^{\mathfrak{A}} := (b_1 \cdot \ldots \cdot b_k)^\dagger.$



Proposition 5. $\mathfrak{A}[B]$ is a Conway algebra iff $B$ satisfies the commutativity
condition. □

Lemma 2. For each $n \ge 2$, the Conway algebra $\mathfrak{A}[B]$ satisfies the power identity
$\mu x.t \approx \mu x.t^n$, for all terms $t = t[x]$, iff $B$ satisfies the identity $x^\dagger \approx (x^n)^\dagger$. □

Corollary 1. $\mathfrak{A}[B]$ is an iteration algebra iff $B$ is a unitary Wilke algebra.

Corollary 2. For any finite subset $X$ of the power identities, there is a scalar
Conway algebra $\mathfrak{A}$ which satisfies all of the identities in $X$, but fails to satisfy
all power identities.

Proof. This follows from the previous lemma and Proposition 3. □

Remark 1. When $B = (B_f, B_\omega)$ is generated by a set $B_0 \subseteq B_f$, one may reduce
the signature of $\mathfrak{A}[B]$ to contain only the letters associated with the elements of
$B_0$. By the same argument one obtains the following stronger version of
Corollary 2: For any finite subset $X$ of the power identities, there is a scalar Conway
$\mu/\Sigma$-algebra $\mathfrak{A}$ having a single operation which satisfies all of the identities in
$X$, but fails to satisfy all power identities.

Corollary 3. Each unitary Wilke algebra $B$ is isomorphic to $\mathfrak{A}[B]W$. □



8 Hyperidentities 

A ‘hyperidentity’ differs from a standard identity only in the way it is
interpreted. Suppose that $\Delta$ is a fixed ranked signature in which $\Delta_n$ is countably
infinite, for each $n \ge 0$. For any signature $\Sigma$, an identity $s \approx t$ between $\mu/\Delta$
terms is said to be a hyperidentity of the $\mu/\Sigma$-algebra $\mathfrak{A}$, if for each way of
substituting a $\mu/\Sigma$ term $t_\delta[x_1, \ldots, x_n, y]$ for each letter $\delta \in \Delta_n$ in the terms $s, t$,
the resulting $\mu/\Sigma$-identity is true in $\mathfrak{A}$. Thus, the operation symbols in $\Delta$ may
be called “meta-operation symbols”. This definition of “hyperidentity” extends
the notion in Taylor [19], in that terms with more than $n$ variables are allowed
to be substituted for n-ary function symbols.

For example, if $F, G \in \Delta_1$, then

$\mu x.F(G(x)) \approx F(\mu x.G(F(x)))$ (7)

is a hyperidentity of the class of all iteration algebras. Indeed, this is just a
restatement of the composition identity.

The following proposition follows from our definition of hyperidentity.

Proposition 6. The two hyperidentities (7) and

$\mu x.\mu y.F(x, y) \approx \mu z.F(z, z)$

axiomatize the Conway algebras.



□ 
Each group identity mentioned above may be formulated as a hyperidentity (con-
taining no free variables), as well. Thus, iteration algebras can be axiomatized 
by an infinite set of hyperidentities. 

Theorem 2. There is no finite set of hyperidentities which axiomatize iteration 
algebras. 

Proof idea. Suppose that there is a finite set $E$ of hyperidentities that axiomatize
iteration algebras. Then there is a finite set $E'$ of scalar hyperidentities such that
a scalar $\mu/\Sigma$-algebra is an iteration algebra iff it is a model of the hyperidentities
$E'$. The equations in $E'$ are obtained from those in $E$ by replacing each
meta-operation symbol $F(x_1, \ldots, x_n)$, $n \geq 1$, by unary meta-operation symbols
$f_i(x_j)$ in all possible ways. (For example, $F(x, y) \approx F(y, x)$ gets replaced by the two
equivalent equations $f_1(x_1) \approx f_1(x_2)$ and $f_2(x_1) \approx f_2(x_2)$.) When the rank
$n$ of $F$ is zero, $F$ remains unchanged. Now if a scalar hyperidentity $s \approx t$ holds
in all iteration algebras, then either no variable occurs free in either $s$ or $t$, or
both sides contain the same free variable. We then translate each such scalar
hyperidentity $s \approx t$ in the finite set $E'$ into an identity $tr(s) \approx tr(t)$ between
two unitary Wilke prealgebra terms. The translation $t \mapsto tr(t)$ is by induction
on the structure of the term $t$. For example, if $x$ has a free occurrence in $t$,
the translation of $\mu x.t$ is $tr(t)^{\dagger}$; otherwise $tr(\mu x.t)$ is $tr(t)$. We then show
that the resulting set of identities $tr(s) \approx tr(t)$ together with the axioms for unitary
Wilke prealgebras gives a finite axiomatization of unitary Wilke algebras, contradicting
Proposition 3. □



9 Conclusion 

Although most equational theories involving a fixed point operation which are 
of interest in theoretical computer science are nonfinitely based, several of them 
have a finite relative axiomatization over iteration algebras. Examples of such 
theories are the equational theory of Kleene algebras of binary relations, or 
(regular) languages, the theory of bisimulation or tree equivalence classes of 
processes equipped with the regular operations, etc. See [10,3,7] and [5]. Since the 
nonfinite equational axiomatizability of these theories is caused by the nonfinite 
axiomatizability of iteration algebras, the constructions of this paper may be 
used to derive simple proofs of the nonfinite axiomatizability of these theories 
as well. 
Abstract. Lossy counter machines are defined as Minsky n-counter machines
where the values in the counters can spontaneously decrease at any time. While 
termination is decidable for lossy counter machines, structural termination (termi- 
nation for every input) is undecidable. This undecidability result has far reaching 
consequences. Lossy counter machines can be used as a general tool to prove the 
undecidability of many problems, for example (1) The verification of systems that 
model communication through unreliable channels (e.g. model checking lossy 
fifo-channel systems and lossy vector addition systems). (2) Several problems for 
reset Petri nets, like structural termination, boundedness and structural bound- 
edness. (3) Parameterized problems like fairness of broadcast communication 
protocols. 



1 Introduction 

Lossy counter machines (LCM) are defined just like Minsky counter machines [19], but 
with the addition that the values in the counters can spontaneously decrease at any time. 
This is called 'lossiness', since a part of the counter is lost. (In a different framework this
corresponds to lost messages in unreliable communication channels.) There are many 
different kinds of lossiness, i.e. different ways in which the counters can decrease. For 
example, one can define that either a counter can only spontaneously decrease by 1, or 
it can only become zero, or it can change to any smaller value. All these different ways 
are described by different lossiness relations (see Section 2). 

The addition of lossiness to counter machines weakens their computational power. 
Some types of lossy counter machines (with certain lossiness relations) are not Turing- 
powerful, since reachability and termination are decidable for them. Since lossy counter 
machines are weaker than normal counter machines, any undecidability result for lossy 
counter machines is particularly interesting. 

The main result of this paper is that structural termination (termination for every 
input) is undecidable for every type of lossy counter machine (i.e. for every lossiness 
relation). 

This result can be applied to prove the undecidability of many problems. To prove 
the undecidability of a problem X, it suffices to choose a suitable lossiness relation L 
and reduce the structural termination problem for lossy counter machines with lossiness 
relation L to the problem X. The important and nice point here is that problem X does 
not need to simulate a counter machine perfectly. Instead, it suffices if X can simulate a 
counter machine imperfectly, by simulating only a lossy counter machine. Furthermore, 
one can choose the right type of imperfection (lossiness) by choosing the lossiness 
relation L. 
Thus lossy counter machines can be used as a general tool to prove the undecidability
of problems. Firstly, they can be used to prove new undecidability results, and secondly 
they can be used to give more elegant, simpler and much shorter proofs of existing results 
(see Section 5). 



2 Definitions 



Definition 1. An n-counter machine [19] M is described by a finite set of states $Q$,
an initial state $q_0 \in Q$, a final state $accept \in Q$, n counters $c_1, \ldots, c_n$ and a
finite set of instructions of the form ($q$ : $c_i := c_i + 1$; goto $q'$) or
($q$ : If $c_i = 0$ then goto $q'$ else $c_i := c_i - 1$; goto $q''$) where $i \in \{1, \ldots, n\}$
and $q, q', q'' \in Q$. A configuration of M is described by a tuple $(q, m_1, \ldots, m_n)$
where $q \in Q$ and $m_i \in \mathbb{N}$ is the content of the counter $c_i$ ($1 \leq i \leq n$). The
size of a configuration is defined by $size((q, m_1, \ldots, m_n)) := \sum_{i=1}^{n} m_i$. The
possible computation steps are defined by

1. $(q, m_1, \ldots, m_n) \to (q', m_1, \ldots, m_i + 1, \ldots, m_n)$
if there is an instruction ($q$ : $c_i := c_i + 1$; goto $q'$).

2. $(q, m_1, \ldots, m_n) \to (q', m_1, \ldots, m_n)$ if there is an instruction
($q$ : If $c_i = 0$ then goto $q'$ else $c_i := c_i - 1$; goto $q''$) and $m_i = 0$.

3. $(q, m_1, \ldots, m_n) \to (q'', m_1, \ldots, m_i - 1, \ldots, m_n)$ if there is an instruction
($q$ : If $c_i = 0$ then goto $q'$ else $c_i := c_i - 1$; goto $q''$) and $m_i > 0$.

A run of a counter machine is a (possibly infinite) sequence of configurations
$s_0, s_1, \ldots$ with $s_0 \to s_1 \to s_2 \to s_3 \to \cdots$. Lossiness relations describe spontaneous
changes in the configurations of lossy counter machines.
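
The three computation steps of Definition 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tuple encodings `("inc", i, q')` and `("dec", i, q', q'')`, the 0-based counter indices, and the restriction to one instruction per state (a deterministic machine) are choices made here, not part of the definition.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical encoding, chosen for this sketch only:
#   ("inc", i, q_next)         stands for (q : c_i := c_i + 1; goto q')
#   ("dec", i, q_zero, q_pos)  stands for (q : If c_i = 0 then goto q'
#                                              else c_i := c_i - 1; goto q'')
Config = Tuple[str, Tuple[int, ...]]   # (state, counter values)
Program = Dict[str, tuple]             # one instruction per state (assumption)

def size(cfg: Config) -> int:
    """size((q, m_1, ..., m_n)) is the sum of all counter values."""
    return sum(cfg[1])

def step(program: Program, cfg: Config) -> Optional[Config]:
    """Perform one computation step; return None if no instruction applies."""
    q, counters = cfg
    instr = program.get(q)
    if instr is None:
        return None
    m = list(counters)
    if instr[0] == "inc":              # step rule 1
        _, i, q_next = instr
        m[i] += 1
        return (q_next, tuple(m))
    _, i, q_zero, q_pos = instr
    if m[i] == 0:                      # step rule 2
        return (q_zero, tuple(m))
    m[i] -= 1                          # step rule 3
    return (q_pos, tuple(m))

def run(program: Program, cfg: Config, max_steps: int) -> List[Config]:
    """Return the run prefix of at most max_steps steps starting at cfg."""
    trace = [cfg]
    for _ in range(max_steps):
        nxt = step(program, trace[-1])
        if nxt is None:
            break
        trace.append(nxt)
    return trace

# Example machine: move the contents of counter 0 into counter 1, then accept.
move = {"q0": ("dec", 0, "accept", "q1"), "q1": ("inc", 1, "q0")}
trace = run(move, ("q0", (2, 0)), max_steps=100)
```

Since `accept` has no instruction in this encoding, the run simply stops there; an infinite run would show up as a trace that always exhausts `max_steps`.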

Definition 2. Let $\xrightarrow{s}$ (for 'sum') be a relation on configurations of n-counter
machines, defined by

$(q, m_1, \ldots, m_n) \xrightarrow{s} (q', m'_1, \ldots, m'_n) \;:\Leftrightarrow\; (q, m_1, \ldots, m_n) = (q', m'_1, \ldots, m'_n) \;\vee\; \left( q = q' \wedge \sum_{i=1}^{n} m_i > \sum_{i=1}^{n} m'_i \right)$

This relation means that either nothing is changed or the sum of all counters strictly
decreases. Let $id$ be the identity relation. A relation $\xrightarrow{l}$ is a lossiness relation iff
$id \subseteq \xrightarrow{l} \subseteq \xrightarrow{s}$. A lossy counter machine (LCM) is given by a counter
machine M and a lossiness relation $\xrightarrow{l}$. Let $\to$ be the normal transition relation of M.
The lossy transition relation $\Rightarrow$ of the LCM is defined by

$s_1 \Rightarrow s_2 \;:\Leftrightarrow\; \exists s'_1, s'_2.\ s_1 \xrightarrow{l} s'_1 \to s'_2 \xrightarrow{l} s_2$.

An arbitrary lossy counter machine is a lossy counter machine with an arbitrary (unspecified)
lossiness relation. The following relations are lossiness relations:

Perfect The relation id is a lossiness relation. Thus arbitrary lossy counter machines 
subsume normal counter machines. 

Classic Lossiness The classic lossiness relation $\xrightarrow{cl}$ is defined by
$(q, m_1, \ldots, m_n) \xrightarrow{cl} (q', m'_1, \ldots, m'_n) :\Leftrightarrow q = q' \wedge \forall i.\ m_i \geq m'_i$.
Here the contents of the counters can become any smaller value. A relation $\xrightarrow{l}$ is
called a subclassic lossiness relation iff $id \subseteq \xrightarrow{l} \subseteq \xrightarrow{cl}$.

Bounded Lossiness A counter can lose at most $x \in \mathbb{N}$ before and after every
computation step. Here the lossiness relation $\xrightarrow{l(x)}$ is defined by
$(q, m_1, \ldots, m_n) \xrightarrow{l(x)} (q', m'_1, \ldots, m'_n) :\Leftrightarrow q = q' \wedge \forall i.\ m_i \geq m'_i \geq \max\{0, m_i - x\}$.
Note that $\xrightarrow{l(x)}$ is a subclassic lossiness relation.

Reset Lossiness If a counter is tested for zero, then it can suddenly become zero. The
lossiness relation $\xrightarrow{rl}$ is defined as follows: $(q, m_1, \ldots, m_n) \xrightarrow{rl} (q', m'_1, \ldots, m'_n)$
iff $q = q'$ and for all $i$ either $m_i = m'_i$ or $m'_i = 0$ and there is an instruction
($q$ : If $c_i = 0$ then goto $q'$ else $c_i := c_i - 1$; goto $q''$). Note that $\xrightarrow{rl}$ is subclassic.
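
The lossiness relations above can be read as predicates on pairs of configurations. The following Python sketch, under the assumption that a configuration is encoded as a pair `(state, counters)` and that `zero_tested` (a hypothetical helper map) records which counters are tested for zero at each state, checks the inclusions stated in the text by brute force on a small finite sample of configurations. It illustrates the definitions; it is not a proof.

```python
from itertools import product

def identity(s, t):
    """The relation id: nothing changes."""
    return s == t

def sum_rel(s, t):
    """The 'sum' relation: nothing changes, or same state and strictly smaller sum."""
    return s == t or (s[0] == t[0] and sum(s[1]) > sum(t[1]))

def classic(s, t):
    """Classic lossiness: same state, every counter may drop to any smaller value."""
    return s[0] == t[0] and all(a >= b for a, b in zip(s[1], t[1]))

def bounded(x):
    """Bounded lossiness: every counter loses at most x."""
    def rel(s, t):
        return s[0] == t[0] and all(
            a >= b >= max(0, a - x) for a, b in zip(s[1], t[1]))
    return rel

def reset(zero_tested):
    """Reset lossiness; zero_tested maps a state to the set of counter
    indices that are tested for zero by an instruction at that state."""
    def rel(s, t):
        tested = zero_tested.get(s[0], ())
        return s[0] == t[0] and all(
            a == b or (b == 0 and i in tested)
            for i, (a, b) in enumerate(zip(s[1], t[1])))
    return rel

def is_subrelation(r1, r2, configs):
    """Check r1 ⊆ r2 on all pairs drawn from the finite sample configs."""
    return all(r2(s, t) for s, t in product(configs, repeat=2) if r1(s, t))

# Small sample: two states, two counters with values 0..2.
configs = [(q, (a, b)) for q in ("p", "q") for a in range(3) for b in range(3)]
relations = [classic, bounded(1), reset({"q": {0}})]
all_subclassic = all(
    is_subrelation(identity, r, configs) and is_subrelation(r, classic, configs)
    for r in relations)
```

On the sample, each of the three relations lies between `identity` and `classic`, and `classic` lies inside `sum_rel`, matching the chain of inclusions in Definition 2.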

The definition of these lossiness relations carries over to other models like Petri nets
[21], where places are considered instead of counters.

Definition 3. For any arbitrary lossy n-counter machine and any configuration $s$ let
$runs(s)$ be the set of runs that start at configuration $s$. (There can be more than one run
if the counter machine is nondeterministic or lossy.) Let $runs^\omega(s)$ be the set of infinite
runs that start at configuration $s$. A run $r = \{(q^i, m^i_1, \ldots, m^i_n)\}_{i=0}^{\infty} \in runs^\omega(s)$ is
space-bounded iff $\exists c \in \mathbb{N}.\ \forall i.\ size((q^i, m^i_1, \ldots, m^i_n)) \leq c$. Let $runs^\omega_b(s)$ be
the space-bounded infinite runs that start at $s$. An (arbitrary lossy) n-counter machine M is

zero-initializing iff in the initial state $q_0$ it first sets all counters to 0.

space-bounded iff the space used by M is bounded by a constant $c$.
$\exists c \in \mathbb{N}.\ \forall r \in runs((q_0, 0, \ldots, 0)).\ \forall s \in r.\ size(s) \leq c$

input-bounded iff in every run from any configuration the size of every reached
configuration is bounded by the input.
$\forall s.\ \forall r \in runs(s).\ \forall s' \in r.\ size(s') \leq size(s)$

strongly-cyclic iff every infinite run from any configuration visits the initial state $q_0$
infinitely often.
$\forall q \in Q, m_1, \ldots, m_n \in \mathbb{N}.\ \forall r \in runs^\omega((q, m_1, \ldots, m_n)).\ \exists m'_1, \ldots, m'_n \in \mathbb{N}.\ (q_0, m'_1, \ldots, m'_n) \in r$

bounded-strongly-cyclic iff every space-bounded infinite run from any configuration
visits the initial state $q_0$ infinitely often.
$\forall q \in Q, m_1, \ldots, m_n \in \mathbb{N}.\ \forall r \in runs^\omega_b((q, m_1, \ldots, m_n)).\ \exists m'_1, \ldots, m'_n \in \mathbb{N}.\ (q_0, m'_1, \ldots, m'_n) \in r$

If M is input-bounded then it is also space-bounded. If M is strongly-cyclic then it is
also bounded-strongly-cyclic. If M is input-bounded and bounded-strongly-cyclic then
it is also strongly-cyclic.

3 Decidable Properties 

Since arbitrary LCM subsume normal counter machines, nothing is decidable for them. 
However, some problems are decidable for classic LCM (with the classic lossiness 
relation). They are not Turing-powerful. The following results in this section are special 
cases of positive decidability results in [4,5,2]. 

Lemma 1. Let M be a classic LCM and $s$ a configuration of M. The set
$pre^*(s) := \{s' \mid s' \Rightarrow^* s\}$ of predecessors of $s$ is effectively constructible.

Theorem 1. Reachability is decidable for classic LCM.

Lemma 2. Let M be a classic LCM with initial configuration $s_0$. It is decidable if there
is an infinite run that starts at $s_0$, i.e. if $runs^\omega(s_0) \neq \emptyset$.

Theorem 2. Termination is decidable for classic LCM.

It has been shown in [4] that even model checking classic LCM with the temporal
logics EF and EG (natural fragments of computation tree logic (CTL) [7,10]) is decidable.
4 The Undecidability Result

We show that structural termination is undecidable for LCM for every lossiness relation. 
We start with the problem CM, which was shown to be undecidable by Minsky [19].

CM

Instance: A 2-counter machine M with initial state $q_0$.

Question: Does M accept $(q_0, 0, 0)$?

BSC-ZI-CM$^\omega_b$

Instance: A bounded-strongly-cyclic, zero-initializing 3-counter machine M with
initial state $q_0$.

Question: Does M have an infinite space-bounded run from $(q_0, 0, 0, 0)$,
i.e. $runs^\omega_b((q_0, 0, 0, 0)) \neq \emptyset$?

Lemma 3. BSC-ZI-CM$^\omega_b$ is undecidable.

Proof. We reduce CM to BSC-ZI-CM$^\omega_b$. Let M be a 2-counter machine with initial state
$q_0$. We construct a 3-counter machine M' as follows: First M' sets all three counters to
0. Then it does the same as M, except that after every instruction it increases the third
counter $c_3$ by 1. Every instruction of M of the form ($q$ : $c_i := c_i + 1$; goto $q'$) with
($1 \leq i \leq 2$) is replaced by $q$ : $c_i := c_i + 1$; goto $q_2$ and $q_2$ : $c_3 := c_3 + 1$; goto $q'$,
where $q_2$ is a new state. Every instruction of the form ($q$ : If $c_i = 0$ then goto $q'$ else
$c_i := c_i - 1$; goto $q''$) with ($1 \leq i \leq 2$) is replaced by three instructions:
$q$ : If $c_i = 0$ then goto $q_2$ else $c_i := c_i - 1$; goto $q_3$, $q_2$ : $c_3 := c_3 + 1$; goto $q'$,
$q_3$ : $c_3 := c_3 + 1$; goto $q''$, where $q_2, q_3$ are new states.

Finally, we replace the accepting state accept of M by the initial state $q'_0$ of M',
i.e. we replace every instruction (goto accept) by (goto $q'_0$). M' is zero-initializing by
definition. M' is bounded-strongly-cyclic, because $c_3$ is increased after every instruction
and only set to zero at the initial state $q'_0$.

$\Rightarrow$ If M is a positive instance of CM then it has an accepting run from $(q_0, 0, 0)$. This
run has finite length and is therefore space-bounded. Then M' has an infinite space-
bounded cyclic run that starts at $(q'_0, 0, 0, 0)$. Thus M' is a positive instance of
BSC-ZI-CM$^\omega_b$.

$\Leftarrow$ If M' is a positive instance of BSC-ZI-CM$^\omega_b$ then there exists an infinite space-
bounded run that starts at the configuration $(q'_0, 0, 0, 0)$. By the construction of M'
this run contains an accepting run of M from the configuration $(q_0, 0, 0)$. Thus M
is a positive instance of CM. □
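
The construction in the proof of Lemma 3 can be sketched as a program transformation. The Python below is a hedged illustration: the tuple encoding of instructions, the fresh state names `init#k`, `q#2`, `q#3`, and the restriction to deterministic machines with 0-based counters are assumptions made for this example. The new initial state `init#0` empties all three counters, every original instruction is followed by an increment of the third counter, and every jump to `accept` is redirected to `init#0`.

```python
def step(program, cfg):
    """One step of a deterministic counter machine; None if no instruction applies."""
    q, counters = cfg
    instr = program.get(q)
    if instr is None:
        return None
    m = list(counters)
    if instr[0] == "inc":                 # ("inc", i, q_next)
        _, i, q_next = instr
        m[i] += 1
        return (q_next, tuple(m))
    _, i, q_zero, q_pos = instr           # ("dec", i, q_zero, q_pos)
    if m[i] == 0:
        return (q_zero, tuple(m))
    m[i] -= 1
    return (q_pos, tuple(m))

def run(program, cfg, max_steps):
    trace = [cfg]
    for _ in range(max_steps):
        nxt = step(program, trace[-1])
        if nxt is None:
            break
        trace.append(nxt)
    return trace

def lemma3_transform(program, q0="q0", accept="accept"):
    """Build M' (3 counters) from M (counters 0 and 1), following Lemma 3.
    Assumes '#' does not occur in the state names of M."""
    start = "init#0"
    redirect = lambda q: start if q == accept else q
    out = {
        # Zero-initialization: empty counters 0, 1, 2 one after the other.
        "init#0": ("dec", 0, "init#1", "init#0"),
        "init#1": ("dec", 1, "init#2", "init#1"),
        "init#2": ("dec", 2, q0, "init#2"),
    }
    for q, instr in program.items():
        if instr[0] == "inc":
            _, i, q_next = instr
            out[q] = ("inc", i, q + "#2")
            out[q + "#2"] = ("inc", 2, redirect(q_next))
        else:
            _, i, q_zero, q_pos = instr
            out[q] = ("dec", i, q + "#2", q + "#3")
            out[q + "#2"] = ("inc", 2, redirect(q_zero))
            out[q + "#3"] = ("inc", 2, redirect(q_pos))
    return out

accepting = {"q0": ("inc", 0, "accept")}   # M accepts after one step
diverging = {"q0": ("inc", 0, "q0")}       # M never accepts
bounded_trace = run(lemma3_transform(accepting), ("init#0", (0, 0, 0)), 200)
growing_trace = run(lemma3_transform(diverging), ("init#0", (0, 0, 0)), 200)
```

For the accepting machine the transformed run cycles forever through `init#0` with bounded counters, whereas for the diverging machine the third counter keeps growing, so the infinite run is not space-bounded. This mirrors the two directions of the proof on toy inputs only.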

$\exists n$LCM$^\omega$

Instance: A strongly-cyclic, input-bounded 4-counter LCM M with initial state $q_0$.
Question: Does there exist an $n \in \mathbb{N}$ s.t. $runs^\omega((q_0, 0, 0, 0, n)) \neq \emptyset$?

Theorem 3. $\exists n$LCM$^\omega$ is undecidable for every lossiness relation.

Proof. We reduce BSC-ZI-CM$^\omega_b$ to $\exists n$LCM$^\omega$ with any lossiness relation
$\xrightarrow{l}$. For any bounded-strongly-cyclic, zero-initializing 3-counter machine M we
construct a strongly-cyclic, input-bounded lossy 4-counter machine M' with initial state $q_0$
and lossiness relation $\xrightarrow{l}$ as follows: The 4-th counter $c_4$ holds the 'capacity'. In every
operation it is changed in a way s.t. the sum of all counters never increases. (More
exactly, the sum of all counters can increase by 1, but only if it was decreased by 1 in
the previous step.) Every instruction of M of the form ($q$ : $c_i := c_i + 1$; goto $q'$) with
($1 \leq i \leq 3$) is replaced by two instructions: $q$ : If $c_4 = 0$ then goto fail else
$c_4 := c_4 - 1$; goto $q_2$, $q_2$ : $c_i := c_i + 1$; goto $q'$, where fail is a special final state and
$q_2$ is a new state. Every instruction of the form ($q$ : If $c_i = 0$ then goto $q'$ else
$c_i := c_i - 1$; goto $q''$) with ($1 \leq i \leq 3$) is replaced by two instructions:
$q$ : If $c_i = 0$ then goto $q'$ else $c_i := c_i - 1$; goto $q_2$, $q_2$ : $c_4 := c_4 + 1$; goto $q''$,
where $q_2$ is a new state.

M' is bounded-strongly-cyclic, because M is bounded-strongly-cyclic. M' is input-
bounded, because every run from a configuration $(q, m_1, \ldots, m_4)$ is space-bounded by
$m_1 + m_2 + m_3 + m_4$. Thus M' is also strongly-cyclic.

$\Rightarrow$ If M is a positive instance of BSC-ZI-CM$^\omega_b$ then there exists an $n \in \mathbb{N}$ and an
infinite run of M that starts at $(q_0, 0, 0, 0)$, visits $q_0$ infinitely often and always
satisfies $c_1 + c_2 + c_3 \leq n$. Since $id \subseteq \xrightarrow{l}$, there is also an infinite run of M' that
starts at $(q_0, 0, 0, 0, n)$, visits $q_0$ infinitely often and always satisfies
$c_1 + c_2 + c_3 + c_4 \leq n$. Thus M' is a positive instance of $\exists n$LCM$^\omega$.

$\Leftarrow$ If M' is a positive instance of $\exists n$LCM$^\omega$ then there exists an $n \in \mathbb{N}$ s.t.
there is an infinite run that starts at the configuration $(q_0, 0, 0, 0, n)$. This run is
space-bounded, because it always satisfies $c_1 + c_2 + c_3 + c_4 \leq n$. By the construction
of M', the sum of all counters can only increase by 1 if it was decreased by 1 in the
previous step. By the definition of lossiness (see Def. 2) we get the following: If lossiness
occurs (when the contents of the counters spontaneously change) then this strictly
and permanently decreases the sum of all counters. It follows that lossiness can
occur at most $n$ times in this infinite run and the sum of all counters is bounded by $n$.
Thus there is an infinite suffix of this run of M' where lossiness does not occur. Thus
there exist $q' \in Q$, $m'_1, \ldots, m'_4 \in \mathbb{N}$ s.t. an infinite suffix of this run of M' without
lossiness starts at $(q', m'_1, \ldots, m'_4)$. It follows that there is an infinite space-bounded
run of M that starts at $(q', m'_1, \ldots, m'_3)$. Since M is bounded-strongly-cyclic, this
run must eventually visit $q_0$. Thus there exist $m''_1, \ldots, m''_3 \in \mathbb{N}$ s.t. an infinite space-
bounded run of M starts at $(q_0, m''_1, \ldots, m''_3)$. Since M is zero-initializing, there is
an infinite space-bounded run of M that starts at $(q_0, 0, 0, 0)$. Thus M is a positive
instance of BSC-ZI-CM$^\omega_b$. □
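
The 'capacity' construction from the proof of Theorem 3 can also be sketched as a program transformation. As before, this is an illustrative sketch under assumptions: instructions are encoded as tuples, machines are deterministic, counters are 0-based (so the capacity counter $c_4$ has index 3), fresh states are named `q#2`, and lossiness is not simulated, which corresponds to the perfect relation $id$. The toy machine used as input is not zero-initializing; it only serves to exercise the transformation.

```python
def step(program, cfg):
    """One step of a deterministic counter machine; None if no instruction applies."""
    q, counters = cfg
    instr = program.get(q)
    if instr is None:
        return None
    m = list(counters)
    if instr[0] == "inc":                 # ("inc", i, q_next)
        _, i, q_next = instr
        m[i] += 1
        return (q_next, tuple(m))
    _, i, q_zero, q_pos = instr           # ("dec", i, q_zero, q_pos)
    if m[i] == 0:
        return (q_zero, tuple(m))
    m[i] -= 1
    return (q_pos, tuple(m))

def run(program, cfg, max_steps):
    trace = [cfg]
    for _ in range(max_steps):
        nxt = step(program, trace[-1])
        if nxt is None:
            break
        trace.append(nxt)
    return trace

def add_capacity(program, cap=3):
    """Build M' from M: every increment first takes one unit from the
    capacity counter (or fails), every decrement gives one unit back.
    Assumes '#' does not occur in the state names of M."""
    out = {}
    for q, instr in program.items():
        mid = q + "#2"                    # the fresh state q_2
        if instr[0] == "inc":
            _, i, q_next = instr
            out[q] = ("dec", cap, "fail", mid)
            out[mid] = ("inc", i, q_next)
        else:
            _, i, q_zero, q_pos = instr
            out[q] = ("dec", i, q_zero, mid)
            out[mid] = ("inc", cap, q_pos)
    return out

# Toy machine M: increment counter 0, decrement it again, repeat forever.
m = {"q0": ("inc", 0, "q1"), "q1": ("dec", 0, "q0", "q0")}
m_prime = add_capacity(m)
with_capacity = run(m_prime, ("q0", (0, 0, 0, 1)), 40)   # n = 1
no_capacity = run(m_prime, ("q0", (0, 0, 0, 0)), 40)     # n = 0
```

With capacity $n = 1$ the transformed machine runs forever and the sum of all counters never exceeds 1; with $n = 0$ the first increment jumps to `fail`. This is the input-boundedness that the proof relies on, observed here on a single example rather than established in general.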

Note that this undecidability result even holds under the additional condition that
the LCMs are strongly-cyclic and input-bounded. It follows immediately that model
checking LCM with the temporal logics CTL (computation-tree logic [7,10]) and LTL
(linear-time temporal logic [22]) is undecidable, since the question of $\exists n$LCM$^\omega$ can be
expressed in these logics. There are two variants of the structural termination problem:

STRUCTTERM-LCM, Variant 1

Instance: A strongly-cyclic, input-bounded 4-counter LCM M with initial state $q_0$.
Question: Does M terminate for all inputs from $q_0$?
Formally: $\forall n_1, \ldots, n_4 \in \mathbb{N}.\ runs^\omega((q_0, n_1, n_2, n_3, n_4)) = \emptyset$?

STRUCTTERM-LCM, Variant 2

Instance: A strongly-cyclic, input-bounded 4-counter LCM M with initial state $q_0$.
Question: Does M terminate for all inputs from every control state $q$?
Formally: $\forall n_1, \ldots, n_4 \in \mathbb{N}.\ \forall q \in Q.\ runs^\omega((q, n_1, n_2, n_3, n_4)) = \emptyset$?

Theorem 4. Structural termination is undecidable for lossy counter machines. Both
variants of STRUCTTERM-LCM are undecidable for every lossiness relation.
Proof. The proof of Theorem 3 carries over, because the LCM is strongly-cyclic and the
3-CM in BSC-ZI-CM$^\omega_b$ is zero-initializing. □

Space-Boundedness for LCM 

Instance: A strongly-cyclic 4-counter LCM M with initial configuration $(q_0, 0, 0, 0, 0)$
Question: Is M space-bounded ? 

Theorem 5. Space-boundedness for LCM is undecidable for all lossiness relations. 

Theorem 6. Structural space-boundedness for LCM is undecidable for every lossiness 
relation. 

Proof. The proof is similar to Theorem 4. An extra counter is used to count the length 
of the run. It is unbounded iff the run is infinite. All other counters are bounded. □ 



5 Applications 

5.1 Lossy Fifo-Channel Systems 

Fifo-channel systems are systems of finitely many finite-state processes that com-
municate with each other by sending messages via unbounded fifo-channels (queues,
buffers). In lossy fifo-channel systems these channels are lossy, i.e. they can sponta-
neously lose (arbitrarily many) messages. This can be used to model communication
via unreliable channels. While normal fifo-channel systems are Turing-powerful, some
safety-properties are decidable for lossy fifo-channel systems [2,5,1]. However, live-
ness properties are undecidable even for lossy fifo-channel systems. In [3] Abdulla and
Jonsson showed the undecidability of the recurrent-state problem for lossy fifo-channel
systems. This problem asks whether certain states of the system can be visited infinitely
often. The undecidable core of the problem is essentially whether there exists an initial
configuration of a lossy fifo-channel system s.t. it has an infinite run. The undecidability
proof in [3] was done by a long and complex reduction from a variant of Post's
correspondence problem (2-permutation PCP [23], which is (wrongly) called cyclic PCP in [3]).

Lossy counter machines can be used to give a much simpler proof of this result.
The lossiness of lossy fifo-channel systems is classic lossiness, i.e. the contents of a
fifo-channel can change to any substring at any time. A lossy fifo-channel system can
simulate a classic LCM (with some additional deadlocks) in the following way: Every
lossy fifo-channel contains a string in $X^*$ (for some symbol $X$) and is used as a classic
lossy counter. The only problem is the test for zero. We test the emptiness of a channel
by adding a special symbol $Y$ and removing it in the very next step. If it can be done then
the channel is empty (or has become empty by lossiness). If this cannot be done, then the
channel was not empty or the symbol $Y$ was lost. In this case we get a deadlock. These
additional deadlocks do not affect the existence of infinite runs, and thus the results of
Section 4 carry over. Thus the problem $\exists n$LCM$^\omega$ (for the classic lossiness relation) can
be reduced to the problem above for lossy fifo-channel systems and the undecidability
follows immediately from Theorem 3.
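
The zero test via the marker symbol $Y$ can be illustrated with a small Python sketch. It models a single channel as a `collections.deque` and does not simulate message loss; the function names and the boolean return value that stands for "deadlock" are assumptions of this example, not part of the reduction.

```python
from collections import deque

def increment(channel: deque) -> None:
    """Counter + 1: send one more X into the channel."""
    channel.append("X")

def decrement(channel: deque) -> bool:
    """Counter - 1: receive an X from the head; False if none is available."""
    if channel and channel[0] == "X":
        channel.popleft()
        return True
    return False

def zero_test(channel: deque) -> bool:
    """Send the marker Y and try to receive it in the very next step.
    The receive succeeds only if Y is at the head, i.e. the channel held
    no X. On failure Y stays in the channel, which models the deadlock."""
    channel.append("Y")
    if channel[0] == "Y":
        channel.popleft()
        return True
    return False

empty = deque()
empty_is_zero = zero_test(empty)          # channel was empty: test succeeds

nonempty = deque()
increment(nonempty)
nonempty_is_zero = zero_test(nonempty)    # an X is ahead of Y: deadlock
```

In a lossy channel the $X$ symbols in front of $Y$ could be lost before the receive, in which case the test would succeed, which matches the remark that the channel "has become empty by lossiness".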
5.2 Model Checking Lossy Basic Parallel Processes

Petri nets [21] (also described as ‘vector addition systems’ in a different framework) are 
a widely known formalism used to model concurrent systems. They can also be seen as 
counter machines without the ability to test for zero, and are not Turing-powerful, since 
the reachability problem is decidable for them [17]. Basic Parallel Processes correspond
to communication-free nets, the (very weak) subclass of labeled Petri nets where every 
transition has exactly one place in its preset. They have been studied intensively in the 
framework of model checking and semantic equivalences (e.g. [12,18,6,15,20]). 

An instance of the model checking problem is given by a system $S$ (e.g. a counter
machine, Petri net, pushdown automaton, ...) and a temporal logic formula $\varphi$. The
question is if the system $S$ has the properties described by $\varphi$, denoted $S \models \varphi$.

The branching-time temporal logics EF, EG and EG$_\omega$ are defined as extensions of
Hennessy-Milner Logic [13,14,10] by the operators EF, EG and EG$_\omega$, respectively.
$s \models EF\varphi$ iff there exists an $s'$ s.t. $s \to^* s'$ and $s' \models \varphi$. $s_0 \models EG_\omega\varphi$ iff there
exists an infinite run $s_0 \to s_1 \to s_2 \to \cdots$ s.t. $\forall i.\ s_i \models \varphi$. EG is similar, except that
it also includes finite runs that end in a deadlock. Alternatively, EF and EG can be
seen as fragments of computation-tree logic (CTL [7,10]), since $EF\varphi = true\ \mathcal{U}\ \varphi$ and
$EG\varphi = \varphi\ w\mathcal{U}\ false$.

Model checking Petri nets with the logic EF is undecidable, but model checking
Basic Parallel Processes with EF is PSPACE-complete [18]. Model checking Basic
Parallel Processes with EG is undecidable [12]. It is different for lossy systems: By
induction on the nesting-depth of the operators EF, EG and EG$_\omega$, and constructions
similar to the ones in Lemma 1 and Lemma 2, it can be shown that model checking
classic LCM with the logics EF, EG and EG$_\omega$ is decidable. Thus it is also decidable for
classical lossy Petri nets and classical lossy Basic Parallel Processes (see [4]).

However, model checking lossy Basic Parallel Processes with nested EF and EG/EG$_\omega$
operators is still undecidable for every subclassic lossiness relation. This is quite surpris-
ing, since lossy Basic Parallel Processes are an extremely weak model of infinite-state
concurrent systems and the temporal logic used is very weak as well.

Theorem 7. Model checking lossy Basic Parallel Processes (with any subclassic
lossiness relation) with formulae of the form $EF\,EG_\omega\Phi$, where $\Phi$ is a Hennessy-Milner
Logic formula, is undecidable.

Proof. Esparza and Kiehn showed in [12] that for every counter machine M (with all
counters initially 0) a Basic Parallel Process $P$ and a Hennessy-Milner Logic formula
$\varphi$ can be constructed s.t. M does not halt iff $P \models EG_\omega\varphi$. The construction carries over
to subclassic LCM and subclassic lossy Basic Parallel Processes. The control-states of
the counter machine are modeled by special places of the Basic Parallel Processes. In
every infinite run that satisfies $\varphi$ exactly one of these places is marked at any time.

We reduce $\exists n$LCM$^\omega$ to the model checking problem. Let M be a subclassic LCM. Let
$P$ be the corresponding Basic Parallel Process as in [12] and let $\varphi$ be the corresponding
Hennessy-Milner Logic formula as in [12]. We use the same subclassic lossiness relation
on M and on $P$. $P$ stores the contents of the 4-th counter in a place $Y$. Thus $P \| Y^n$
corresponds to the configuration of M with $n$ in the 4-th counter (and 0 in the others).

We define a new initial state $X$ and transitions $X \xrightarrow{a} X \| Y$ and $X \xrightarrow{b} P$, where $a$ and $b$
do not occur in $P$. Let $\Phi := \varphi \wedge \neg\langle b \rangle true$. Then M is a positive instance of
$\exists n$LCM$^\omega$ iff $X \models EF\,EG_\omega\Phi$. The result follows from Theorem 3. □
For Petri nets and Basic Parallel Processes, the meaning of Hennessy-Milner Logic
formulae can be expressed by boolean combinations of constraints of the form p > k 
(at least k tokens on place p). Thus the results also hold if boolean combinations of such 
constraints are used instead of Hennessy-Milner Logic formulae. Another consequence 
of Theorem 7 is that model checking lossy Petri nets with CTL is undecidable. 



5.3 Reset/Transfer Petri Nets 

Reset Petri nets are an extension of Petri nets by the addition of reset-arcs. A reset-arc
between a transition and a place has the effect that, when the transition fires, all tokens
are removed from this place, i.e. it is reset to zero. Transfer nets and transfer arcs are
defined similarly, except that all tokens on this place are moved to some different place.
It was shown in [8] that termination is decidable for ‘Reset Post G-nets’, a more general 
extension of Petri nets that subsumes reset nets and transfer nets. (For normal Petri nets 
termination is EXPSPACE-complete [24]). While boundedness is trivially decidable 
for transfer nets, the same question for reset nets was open for some time (and even a 
wrong decidability proof was published). Finally, it was shown in [8,9] that boundedness 
(and structural boundedness) is undecidable for reset Petri nets. The proof in [8] was 
done by a complex reduction from Hilbert’s 10th problem (a simpler proof was later 
given in [9]). 

Here we generalize these results by using lossy counter machines. This also gives a
unified framework and considerably simplifies the proofs.

Lemma 4. Reset Petri nets can simulate lossy counter machines with reset-lossiness. 

Theorem 8. Structural termination, boundedness and structural boundedness are un- 
decidable for lossy reset Petri nets with every subclassic lossiness relation. 

Proof. It follows from Lemma 4 that a lossy reset Petri net with subclassic lossiness
relation $\xrightarrow{l}$ can simulate a lossy counter machine with lossiness relation
$\xrightarrow{l} \cup \xrightarrow{rl}$. The results follow from Theorem 4, Theorem 5 and Theorem 6. □

The undecidability result on structural termination carries over to transfer nets (in- 
stead of a reset the tokens are moved to a special ‘dead’ place), but the others don’t. 
Note that for normal Petri nets structural termination and structural boundedness can be 
decided in polynomial time (just check if there is a positive linear combination of effects 
of transitions). Theorem 7 and Theorem 8 also hold for arbitrary lossiness relations, but 
this requires an additional argument. The main point is that Petri nets (unlike counter 
machines) can increase a place/counter and decrease another in the same step. 



5.4 Parameterized Problems 

We consider verification problems for systems whose definition includes a parameter
$n \in \mathbb{N}$. Intuitively, $n$ can be seen as the size of the system. Examples are

- Systems of n indistinguishable communicating finite-state processes.

- Systems of communicating pushdown automata with n-bounded stack.

- Systems of (a fixed number of) processes that communicate through (lossy) buffers
or queues of size n.
Let $P(n)$ be such a system with parameter $n$. For every fixed $n$, $P(n)$ is a system
with finitely many states and thus (almost) every verification problem is decidable for
it. So the problem $P(n) \models \Phi$ is decidable for any temporal logic formula $\Phi$ from
any reasonable temporal logic, e.g. modal $\mu$-calculus [16] or monadic second-order
theory. The parameterized verification problem asks whether a property holds independently
of the parameter $n$, i.e. for any size. Formally, the question is if for given $P$ and $\Phi$ we
have $\forall n \in \mathbb{N}.\ P(n) \models \Phi$ (or $\neg\exists n \in \mathbb{N}.\ P(n) \models \neg\Phi$). Many of these parameterized
problems are undecidable by the following meta-theorem.

Theorem 9. A parameterized verification problem is undecidable if it satisfies the fol- 
lowing conditions: 

1. It can encode an n-space-bounded lossy counter machine (for some lossiness rela- 
tion) in such a way that P(n) corresponds to the initial configuration with n in one 
counter and 0 in the others. 

2. It can check for the existence of an infinite run. 

Proof. By a reduction of $\exists n$LCM$^\omega$ and Theorem 3. The important point is that in the
problem $\exists n$LCM$^\omega$ one can require that the LCM is input-bounded. □

The technique of Theorem 9 is used in [11] to show the undecidability of the model
checking problem for linear-time temporal logic (LTL) and broadcast communication
protocols. These are systems of $n$ indistinguishable communicating finite-state processes
where a 'broadcast' by one process can affect all other $n - 1$ processes. Such a broadcast
can be used to set a simulated counter to zero. However, there is no test for zero. One
reduces $\exists n$LCM$^\omega$ with lossiness relation $\xrightarrow{rl}$ to the model checking problem. In the
same way, similar results can be proved for parameterized problems about systems with
bounded buffers, stacks, etc.



6 Conclusion 

Lossy counter machines can be used as a general tool to show the undecidability of many
problems. They provide a unified way of reasoning about many quite different classes
of systems. For example the recurrent-state problem for lossy fifo-channel systems,
the boundedness problem for reset Petri nets and the fairness problem for broadcast 
communication protocols were previously thought to be completely unrelated. Yet lossy 
counter machines show that the principles behind their undecidability are the same. 
Moreover, the undecidability proofs for lossy counter machines are very short and much 
simpler than previous proofs of weaker results [3,8]. 

Lossy counter machines have also been used in this paper to show that even for very 
weak temporal logics and extremely weak models of infinite-state concurrent systems, 
the model checking problem is undecidable (see Subsection 5.2). We expect that many 
more problems can be shown to be undecidable with the help of lossy counter machines, 
especially in the area of parameterized problems (see Subsection 5.4). 
Abstract. The main result of the paper is the reduction of the problem
of satisfiability of equations in free groups to the satisfiability of equations 
in free semigroups with anti-involution (SGA), by a non-deterministic 
polynomial time transformation. 

A free SGA is essentially the set of words over a given alphabet plus an 
operator which reverses words. We study equations in free SGA, general- 
izing several results known for equations in free semigroups, among them 
that the exponent of periodicity of a minimal solution of an equation E 
in free SGA is bounded by 

1 Introduction 

The study of the problem of solving equations in free SGA (unification in free 
SGA) and its computational complexity is a problem closely related to the prob- 
lem of solving equations in free semigroups and in free groups, which lately have 
attracted much attention of the theoretical computer science community [3], [12], 
[13], [14]. 

The free semigroup with anti-involution is a structure which lies in between that
of free semigroups and free groups. Besides the relationship with semigroups and
groups, the axioms defining SGA show up in several important theories, like
algebras of binary relations, transposition of matrices, and inverse semigroups.

The problem of solving equations in free semigroups was proven to be decid- 
able by Makanin in 1976 in a long paper [10] . Some years later, in 1982, again 
Makanin proved that solving equations in free groups was a decidable problem 
[11]. The technique used was similar to that of the first paper, although the 
details are much more involved. He reduced equations in free groups to solving 
equations in free SGA with special properties (‘non contractible’), and showed 
decidability for equations of this type. For free SGA (without any further condi-
tion) the decidability of the problem of satisfiability of equations is still open,
although we conjecture it is decidable. 

Both of Makanin's algorithms have received much attention. The enu-
meration of all unifiers was done by Jaffar for semigroups [6] and by Razborov
for groups [15]. Then, the complexity became the main issue. Several authors
have analyzed the complexity of Makanin's algorithm for semigroups [6], [16], [1],
with EXPSPACE being the best upper bound so far [3]. Very recently Plandowski,
without using Makanin's algorithm, presented an upper bound of PSPACE for
the problem of satisfiability of equations in free semigroups [14]. On the other
hand, the analysis of the complexity of Makanin's algorithm for groups was done
by Koscielski and Pacholski [8], who showed that it is not primitive recursive.

With respect to lower bounds, the only one known for both problems 
is NP-hardness, which seems weak for the case of free groups. It is easy to see 
that this lower bound holds for the case of free SGA as well. 

The main result of this paper is the reduction of equations in free groups to 
equations in free SGA (Theorem 9 and Corollary 10). This is achieved by 
generalizing to SGA several known results for semigroups, using some of Makanin’s 
results in [11], and proving a result that links these results (Proposition 3). 
Although we do not use it here, we show that the standard bounds on the 
exponent of periodicity of minimal solutions to word equations also hold, with minor 
modifications, in the case of free SGA (Theorem 5). 

For concepts of word combinatorics we will follow the notation of [9]. By $\epsilon$ 
we denote the empty word. 

2 Equations in Free SGA 

A semigroup with anti-involution (SGA) is an algebra with a binary associative 
operation (written as concatenation) and a unary operation $(\ )^{-1}$ with the 
equational axioms 

$(xy)z = x(yz), \quad (xy)^{-1} = y^{-1}x^{-1}, \quad (x^{-1})^{-1} = x.$ (1) 

A free semigroup with anti-involution is an initial algebra for this variety. It is 
not difficult to check that, for a given alphabet $C$, the set of words over $C \cup C^{-1}$ 
together with the operator $(\ )^{-1}$, which reverses a word and changes every letter 
to its twin (e.g. $a$ to $a^{-1}$ and conversely), is a free algebra for SGA over $C$. 

Equations and Solutions. Let $C$ and $V$ be two disjoint alphabets of constants 
and variables respectively. Denote $C^{-1} = \{c^{-1} : c \in C\}$, and similarly for $V^{-1}$. 
An equation $E$ in free SGA with constants $C$ and variables $V$ is a pair $(w_1, w_2)$ of 
words over the alphabet $A = C \cup C^{-1} \cup V \cup V^{-1}$. The number $|E| = |w_1| + |w_2|$ is 
the length of the equation $E$, and $|E|_V$ will denote the number of occurrences of 
variables in $E$. These equations are also known as equations in a paired alphabet. 

A map $S : V \to (C \cup C^{-1})^*$ can be uniquely extended to an SGA-homomorphism 
$A^* \to (C \cup C^{-1})^*$ by defining $S(c) = c$ for $c \in C$ and 
$S(u^{-1}) = (S(u))^{-1}$ for $u \in C \cup V$. We will use the same symbol $S$ for the map 
and for the SGA-homomorphism. A solution $S$ of the equation $E = (w_1, w_2)$ 
is (the unique SGA-homomorphism defined by) a map $S : V \to (C \cup C^{-1})^*$ 
such that $S(w_1) = S(w_2)$. The length of the solution $S$ is $|S(w_1)|$. By $S(E)$ 
we denote the word $S(w_1)$ (which is the same as $S(w_2)$). Each occurrence of 
a symbol $u \in A$ in $E$ with $S(u) \neq \epsilon$ determines a unique factor in $S(E)$, say 
$S(E)[i,j]$, which we will denote by $S(u,i,j)$ and call simply an image of $u$ in 
$S(E)$. 
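
The definitions above can be made concrete with a small sketch. It is only an illustration under an assumed encoding that is not part of the paper: a letter is a lowercase character and its twin is the corresponding uppercase character, so the anti-involution becomes "reverse and swap case"; the variables are the lowercase keys of the dictionary `S`. The names `inv`, `apply_map` and `is_solution` are hypothetical.

```python
def inv(w: str) -> str:
    """Anti-involution: reverse the word and replace each letter by its twin."""
    return w[::-1].swapcase()


def apply_map(S: dict, w: str) -> str:
    """Extend the map S (variable -> word of constants) to a word over A.

    Constants are sent to themselves; an uppercase variable X stands for
    x^{-1} and is sent to inv(S[x]).
    """
    out = []
    for ch in w:
        v = ch.lower()
        if v in S:                      # ch is a variable or a twin of one
            out.append(S[v] if ch.islower() else inv(S[v]))
        else:                           # ch is a constant or a twin of one
            out.append(ch)
    return "".join(out)


def is_solution(S: dict, w1: str, w2: str) -> bool:
    """S solves the equation (w1, w2) when both sides have the same image."""
    return apply_map(S, w1) == apply_map(S, w2)
```

With this encoding, `{"x": "aA"}` solves the equation $(x, x^{-1})$, written `("x", "X")`, because $aa^{-1}$ is its own image under the anti-involution, whereas `{"x": "a"}` does not.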

The Equivalence Relation $(S,E)$. Let $S$ be a solution of $E$ and $P$ be the set of 
positions of $S(E)$. Define the binary relation $(S,E)'$ in $P \times P$ as follows: given 
positions $p, q \in P$, $p\,(S,E)'\,q$ if and only if one of the following holds: 

1. $p = i + k$ and $q = i' + k$, where $S(x,i,j)$ and $S(x,i',j')$ are images of $x$ in 
$S(E)$ and $0 \le k < |S(x)|$. 

2. $p = i + k$ and $q = j' - k$, where $S(x,i,j)$ and $S(x^{-1},i',j')$ are images of $x$ 
and $x^{-1}$ in $S(E)$ and $0 \le k < |S(x)|$. 

Then define $(S,E)$ as the transitive closure of $(S,E)'$. Observe that $(S,E)$ is an 
equivalence relation. 
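
As a rough illustration of this relation, the following sketch computes its classes with a union-find structure. It assumes the same hypothetical encoding as before (lowercase letter, uppercase twin, variables as the keys of `S`) and takes $0 \le k < |S(x)|$ as the range of offsets. Positions that belong to no variable image are returned as singleton classes, which is a convention of this sketch only.

```python
def images(S: dict, w: str):
    """Return ([(variable, inverted?, i, j), ...], |S(w)|) with 1-based spans."""
    found, pos = [], 0
    for ch in w:
        v = ch.lower()
        n = len(S[v]) if v in S else 1
        if v in S and n > 0:
            found.append((v, ch.isupper(), pos + 1, pos + n))
        pos += n
    return found, pos


def equivalence_classes(S: dict, w1: str, w2: str):
    """Classes of positions of S(E) linked through images of the same variable."""
    imgs1, n = images(S, w1)
    imgs2, _ = images(S, w2)
    parent = list(range(n + 1))

    def find(p: int) -> int:
        while parent[p] != p:
            parent[p] = parent[parent[p]]   # path halving
            p = parent[p]
        return p

    first_seen = {}
    for v, inverted, i, j in imgs1 + imgs2:
        for k in range(j - i + 1):
            # The k-th letter of S(x) sits at i + k in an image of x
            # and at j - k in an image of x^{-1}.
            p = j - k if inverted else i + k
            q = first_seen.setdefault((v, k), p)
            parent[find(p)] = find(q)

    classes = {}
    for p in range(1, n + 1):
        classes.setdefault(find(p), []).append(p)
    return sorted(classes.values())
```

For the equation `("xx", "aAaA")` with `S = {"x": "aA"}`, the two images of $x$ link position 1 with 3 and position 2 with 4.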

Contractible Words. A word $w \in A^*$ is called non-contractible if for every $u \in A$ 
the word $w$ contains neither the factor $uu^{-1}$ nor $u^{-1}u$. An equation $(w_1, w_2)$ is 
called non-contractible if both $w_1$ and $w_2$ are non-contractible. A solution $S$ to 
an equation $E$ is called non-contractible if for every variable $x$ which occurs in 
$E$, the word $S(x)$ is non-contractible. 
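
A minimal sketch of these checks, again under the assumed encoding in which the twin of a lowercase letter is the corresponding uppercase letter (the function names are hypothetical, and `S` is assumed to be defined exactly on the variables occurring in the equation):

```python
def is_non_contractible(w: str) -> bool:
    """True when no letter of w is immediately followed by its twin."""
    return all(a != b.swapcase() for a, b in zip(w, w[1:]))


def is_non_contractible_solution(S: dict) -> bool:
    """True when the image S(x) of every variable is non-contractible."""
    return all(is_non_contractible(image) for image in S.values())
```

The same test applies to the sides of an equation, since the definition ranges over all symbols of $A$, variables included: the word `"xX"` is contractible.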

Boundaries and Superpositions. Given a word $w \in A^*$, we define a boundary of 
$w$ as a pair of consecutive positions $(p, p+1)$ in $w$. We will write simply $p_w$, the 
subindex denoting the corresponding word. By extension, we define $i(w) = 0_w$ 
and $f(w) = |w|_w$, the initial and final boundaries respectively. Note that the 
boundaries of $w$ have a natural linear order ($p_w \le q_w$ iff $p \le q$ as integers). 

Given an equation $E = (w_1, w_2)$, a superposition (of the boundaries of the 
left and right hand sides) of $E$ is a linear order $\le$ of the set of boundaries of $w_1$ 
and $w_2$ extending the natural orders of the boundaries of $w_1$ and $w_2$, such that 
$i(w_1) = i(w_2)$ and $f(w_1) = f(w_2)$, and possibly identifying some $p_{w_1}$ and $q_{w_2}$. 

Cuts and Witnesses. Given a superposition $\le$ of $E = (w_1, w_2)$, a cut is a boundary 
$j$ of $w_2$ (resp. $w_1$) such that $j \neq b$ for all boundaries $b$ of $w_1$ (resp. $w_2$). 
Hence a cut determines at least three symbols of $E$, namely $w_2[j]$, $w_2[j+1]$ and 
$w_1[i+1]$, where $i$ is such that $i_{w_1} < j_{w_2} < (i+1)_{w_1}$ in the linear order, see 
Figure 1. The triple of symbols $(w_2[j], w_2[j+1], w_1[i+1])$ is called a witness of the 
cut. A superposition is called consistent if $w_1[i+1]$ is a variable. 

Observe that every superposition gives rise to a system of equations $(E, \le)$, 
which codifies the constraints given by $\le$, by adding the corresponding equations 
and variables $x = x'y$ which the cuts determine. Also observe that every solution 
$S$ of $E$ determines a unique consistent superposition, denoted $\le_S$. Note finally 
that the cut $j$ determines a boundary $(r, r+1)$ in $S(E)$; if $p \le r < q$, we say 
that the subword $S(E)[p,q]$ of $S(E)$ contains the cut $j$. 

Lemma 1 Let $E$ be an equation in free SGA. Then $E$ has a solution if and only 
if $(E, \le)$ has a solution for some consistent superposition $\le$. There are no more 
than $|E|^{4|E|_V}$ consistent superpositions. 

Fig. 1. The cut $j_w$ 



Proof. Obviously, if $(E, \le)$ has a solution for some consistent superposition $\le$, 
then $E$ has a solution. Conversely, if $E$ has a solution $S$, consider the 
superposition generated by $S$. 

As for the bound, let $E = (w_1, w_2)$ and write $v$ for $|E|_V$. First observe 
that if $w_2$ consists only of constants, then there are at most $|w_2|^v$ consistent 
superpositions. To get a consistent superposition in the general case, first insert 
each initial and final boundary of each variable in $w_2$ in the linear order of the 
boundaries of $w_1$ (this can be done in at most $|E| + v$ ways). Then it remains to 
deal with the subwords of $w_2$ in between variables (hence consisting only of 
constants and of total length $\le |E| - v$). Summing up, there are no more than 
$(|E| + v)^{2v}(|E| - v)^{v} \le |E|^{4v}$ consistent superpositions. 

Lemma 2 (Compare Lemma 6, [12]) Assume $S$ is a minimal (w.r.t. length) 
solution of $E$. Then 

1. For each subword $w = S(E)[i,j]$ with $|w| > 1$, there is an occurrence of $w$ 
or $w^{-1}$ which contains a cut of $(E, \le_S)$. 

2. For each letter $c = S(E)[i]$ of $S(E)$, there is an occurrence of $c$ or $c^{-1}$ in $E$. 

Proof. Let $1 \le p \le q \le |S(E)|$. Suppose neither $w = S(E)[p,q]$ nor $w^{-1}$ have 
occurrences in $S(E)$ which contain cuts. Consider the position $p$ in $S(E)$ and its 
$(S,E)$-equivalence class $P$, and define for each variable $x$ occurring in $E$, 

$S'(x)$ = the subsequence of some image $S(x,i,j)$ of $x$ consisting of 
all positions which are not in the set $P$ (i.e. “cut off” from $S(x,i,j)$ 
the positions in $P$). 

It is not difficult to see that $S'$ is well defined, i.e., it does not depend on the 
particular image $S(x,i,j)$ of $x$ chosen, and that $S'(w_1) = S'(w_2)$ (these facts 
follow from the definition of $(S,E)$-equivalence). Now, if $P$ does not contain any 
images of constants of $E$, it is easy to see that $S'$ is a solution of the equation $E$. 
But $|S'(E)| < |S(E)|$, which is impossible because $S$ was assumed to be minimal. 

Hence, for each word $w = S(E)[p,q]$, its first position must be in the same $(S,E)$-class 
as the position of the image of a constant $c$ of $E$. If $p < q$, the right (resp. 
left) boundary of that constant is a cut in $w$ (resp. $w^{-1}$) which is neither initial 
nor final (check the definition of $(S,E)$-equivalence for $S(E)[p+1]$, etc.), and we are 
in case 1. If $p = q$ we are in case 2. 

Proposition 3 For each non-contractible equation $E$ there is a finite list of 
systems of equations $E_1, \ldots, E_k$ such that the following conditions hold: 

1. $E$ has a non-contractible solution if and only if one $E_i$ has a solution. 

2. k< 

3. There is a constant $c > 0$ such that $|E_i| \le c|E|$ and $|E_i|_V \le c|E|_V$ for each 
$i = 1, \ldots, k$. 

Proof. Let $\le$ be a consistent superposition of $E$, and let 

$(x_1, y_1, z_1), \ldots, (x_r, y_r, z_r)$ (2) 

be a list of those witnesses of the cuts of $(E, \le)$ for which at least one of the 
$x_i, y_i$ is a variable. Let 

$D = \{(c,d) \in (C \cup C^{-1})^2 : c \neq d^{-1} \wedge d \neq c^{-1}\}$, 
and define for each $r$-tuple $((c_i, d_i))_i$ of pairs of $D$ the system 

$E_{((c_i,d_i))_i} = (E, \le) \cup \{(x_i, x'_i c_i),\ (y_i, d_i y'_i) : i = 1, \ldots, r\}.$ 

Now, if $S$ is a non-contractible solution of $(E, \le)$ then $S$ defines a solution of 
some $E_i$, namely the one defined by the $r$-tuple with elements $(c_i, d_i) = 
(S(x_i)[|S(x_i)|], S(y_i)[1])$, for $i = 1, \ldots, r$. Note that because $E$ and $S$ are 
non-contractible, each $(c_i, d_i)$ is in $D$. 

In the other direction, suppose that $S$ is a solution of some $E_i$. Then 
obviously $S$ is a solution of $(E, \le)$. We only need to prove that $S(z)$ is 
non-contractible for all variables $z$ occurring in $E$. Suppose some $S(z)$ has a factor $cc^{-1}$, 
for $c \in C$. Then by Lemma 2 there is an occurrence of $cc^{-1}$ (its converse is 
the same) which contains a cut of $(E, \le)$. But because $E$ is non-contractible, we 
must have that one of the terms in (2), say $(x_j, y_j, z_j)$, witnesses this occurrence, 
hence $x_j = x'_j c$ and $y_j = c^{-1} y'_j$, which is impossible by the definition of the $E_i$’s. 

The bound in 2. follows by simple counting: observe that r < 2|if |y and \D\ < 
|Cp’’ < and the number k of systems is no bigger than the number 

of superpositions times \D\. For the bounds in 3. just sum the corresponding 
numbers of the new equations added. 

The following is an old observation of Hmelevskii [5] for free semigroups 
which extends easily to free SGA: 

Proposition 4 For each system of equations $\bar{E}$ in free SGA with generators $C$, 
there is an equation $E$ in free SGA with generators $C \cup \{c\}$, $c \notin (C \cup C^{-1})$, such 
that 

1. $S$ is a solution of $\bar{E}$ if and only if $S$ is a solution of $E$. 

2. $|E| \le 4|\bar{E}|$ and $|E|_V = 2|\bar{E}|_V$. 



Moreover, if the equations in the system are non-contractible, then $E$ is non-contractible. 

Proof. Let $(v_1, w_1), \ldots, (v_n, w_n)$ be the equations of the system. Define $E$ as 

$(v_1 c v_2 c \cdots c v_n\, c\, v_1 c^{-1} v_2 c^{-1} \cdots c^{-1} v_n,\ \ w_1 c w_2 c \cdots c w_n\, c\, w_1 c^{-1} w_2 c^{-1} \cdots c^{-1} w_n)$. 

Clearly $E$ is non-contractible, because so was each equation $(v_i, w_i)$ and $c$ is a 
fresh letter. Also, if $S$ is a solution of the system, obviously it is a solution of $E$. Conversely, 
if $S$ is a solution of $E$, then 

$|S(v_1 c v_2 c \cdots c v_n)| = |S(v_1 c^{-1} v_2 c^{-1} \cdots c^{-1} v_n)|$, 

hence 

$|S(v_1 c v_2 c \cdots c v_n)| = |S(w_1 c w_2 c \cdots c w_n)|$, 

and the same for the second pair of expressions with $c^{-1}$. Now it is easy to show 
that $S(v_i) = S(w_i)$ for all $i$: suppose not, for example $|S(v_1)| < |S(w_1)|$. Then 
$S(w_1)[|S(v_1)| + 1] = c$ and $S(w_1)[|S(v_1)| + 1] = c^{-1}$, which is impossible. Then argue the 
same for the rest. 

The bounds are simple calculations. 
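
The encoding used in this proof can be sketched as follows. This is an illustration only, under an assumed representation that is not the paper's: a letter is a lowercase character, its twin is the uppercase character, a system is a list of pairs of strings, and the fresh letter defaults to `"z"` (it must also not be used as a variable name). The names `encode_system` and `apply_map` are hypothetical.

```python
def inv(w: str) -> str:
    """Anti-involution: reverse the word and swap every letter with its twin."""
    return w[::-1].swapcase()


def apply_map(S: dict, w: str) -> str:
    """Image of the word w under the map S (variables are the keys of S)."""
    return "".join(
        (S[ch.lower()] if ch.islower() else inv(S[ch.lower()]))
        if ch.lower() in S else ch
        for ch in w
    )


def encode_system(system, c: str = "z"):
    """Build the single equation of the proof from the pairs (v_i, w_i)."""
    assert all(c not in (v + w).lower() for v, w in system), "c must be fresh"
    c_inv = c.swapcase()
    lefts = [v for v, _ in system]
    rights = [w for _, w in system]
    # v_1 c v_2 c ... c v_n  c  v_1 c^{-1} v_2 c^{-1} ... c^{-1} v_n
    left = c.join(lefts) + c + c_inv.join(lefts)
    right = c.join(rights) + c + c_inv.join(rights)
    return left, right
```

For the system `[("x", "X"), ("xb", "yb")]` the map `{"x": "aA", "y": "aA"}` solves both the system and the encoded equation, while `{"x": "a", "y": "a"}` solves neither.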

The next result is a very important one, and follows from a straightforward 
generalization of the result in [7], where it is proved for semigroups. 

Theorem 5 Let $E$ be an equation in free SGA. Then, the exponent of periodicity 
of a minimal solution of $E$ is bounded by $2^{O(|E|)}$. 



Proof. It is not worth reproducing here the ten-page proof in [7], because the 
changes needed to generalize it to free SGA are minor. We will assume that 
the reader is familiar with the paper [7]. 

The proof there consists of two independent parts: (1) to obtain from the 
word equation $E$ a linear Diophantine equation, and (2) to get a good bound 
for it. We will sketch how to do step (1) for free SGA. The rest is completely 
identical. 

First, let us sketch how the system of linear equations is obtained from a 
word equation $E$. Let $S$ be a solution of $E$. Recall that a $P$-stable presentation 
of $S(x)$, for a variable $x$, has the form 

$S(x) = w_0 P^{\mu_0} w_1 P^{\mu_1} \cdots w_{n-1} P^{\mu_{n-1}} w_n.$ 

From here, for a suitable $P$ (which is the word that witnesses the exponent of 
periodicity of $S(E)$), a system of linear Diophantine equations $LD_P(E)$ is built, 
roughly speaking, by replacing the $\mu_i$ by variables in the case of variables, 
plus some other pieces of data. Then it is proved that if $S$ is a minimal solution 
of $E$, the corresponding solution is a minimal solution of $LD_P(E)$. 

For the case of free SGA, there are two key points to note. First, for the 
variables of the form $x^{-1}$, the solution $S(x^{-1})$ will have the following $P^{-1}$-stable 
presentation (same $P$, $w_i$, $\mu_i$ as before): 

$S(x^{-1}) = w_n^{-1} (P^{-1})^{\mu_{n-1}} w_{n-1}^{-1} (P^{-1})^{\mu_{n-2}} \cdots w_1^{-1} (P^{-1})^{\mu_0} w_0^{-1}.$ 

Second, note that $P^{-1}$ is a subword of $PP$ if and only if $P$ is a subword of 
$P^{-1}P^{-1}$. Call a repeated occurrence of $P$ in $w$, say $w = uP^kv$, maximal, if $P$ 
is neither a suffix of $u$ nor a prefix of $v$. So it holds that maximal occurrences 
of $P$ and $P^{-1}$ in $w$ either (1) do not overlap each other, or (2) overlap almost 
completely (exponents will differ at most by 1). 

In case (1), consider the system $LD_P(E') \cup LD_{P^{-1}}(E')$ (each one constructed 
exactly as in the case of word equations), where $E'$ is the equation $E$ in which we 
consider the pairs of variables $x^{-1}, x$ as independent for the sake of building 
the system of linear Diophantine equations. And, of course, the Diophantine variables 
obtained from the same $\mu_i$ in $S(x)$ and $S(x^{-1})$ are the same. 

In case (2), notice that the $P$-stable and $P^{-1}$-stable presentations for a variable 
$x$ differ very little. So it is enough to consider $LD_P(E')$, taking care of using for 
the $P$-presentation of $S(x^{-1})$ the same set of Diophantine variables (adding 1 
or −1 where appropriate) used for the $P$-presentation of $S(x)$. 

It must then be proved that if $S$ is a minimal solution of the equation $E$ in free 
SGA, then the corresponding solution is a minimal solution of the 
system of linear Diophantine equations defined as above. This can be proved 
easily with the help of Lemma 2. 

Finally, as for the parameters of the system of Diophantine equations, 
observe that $|E'| = |E|$, hence the only parameters that grow are the number of 
variables and equations, and by a factor of at most 2. So the asymptotic bound 
remains the same as for the case of $E'$, which is $2^{O(|E|)}$. 

The last result concerning equations in free SGA we will prove follows from 
the trivial observation that every equation in free semigroups is an equation in 
free SGA. Moreover: 

Proposition 6 Let $M$ be a free semigroup on the set of generators $C$, $N$ 
a free SGA on the set of generators $C$, and $E$ an equation in $M$. Then $E$ is 
satisfiable in $M$ if and only if it is satisfiable in $N$. 

Proof. An equation in free SGA which does not contain $(\ )^{-1}$ has a solution if 
and only if it has a solution which does not contain $(\ )^{-1}$. So the codification of 
equations in free semigroups into free SGA is straightforward: the same equation. 

We get immediately a lower bound for the problem of satisfiability of equa- 
tions in free SGA by using the corresponding result for the free semigroup case. 

Corollary 7 Satisfiability of equations in free SGA is NP-hard. 



3 Reducing the Problem of Satisfiability of Equations in 
Free Groups to Satisfiability of Equations in Free SGA 

A group is an algebra with a binary associative operation (written as 
concatenation), a unary operation $(\ )^{-1}$, and a constant 1, with the axioms (1) plus 

$xx^{-1} = 1, \quad x^{-1}x = 1, \quad 1x = x1 = x.$ (3) 

As in the case of free SGA, it is not hard to see that the set of non-contractible 
words over $C \cup C^{-1}$ plus the empty word, with the operations of composition and 
reverse suitably defined, is a free group with generators $C$. 

Equations in free groups. The formal concept of equation in free groups is almost 
exactly the same as that for free SGA, hence we will not repeat it here. The 
difference comes when speaking of solutions. A solution $S$ of the equation $E$ 
is (the unique group-homomorphism $S : A^* \to (C \cup C^{-1})^*$ defined by) a map 
$S : V \to (C \cup C^{-1})^*$, extended by defining $S(c) = c$ for each $c \in C$ and $S(w^{-1}) = 
(S(w))^{-1}$, which satisfies $S(w_1) = S(w_2)$. Observe that the only difference with 
the case of SGA is that now we possibly have ‘simplifications’ of subexpressions 
of the form $ww^{-1}$ or $w^{-1}w$ to 1, i.e. the use of the equations (3). 
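
The difference between the two notions of solution can be illustrated by free reduction, i.e. cancelling adjacent twins until none remain. The sketch below is an assumption-laden illustration, not the paper's construction: letters are lowercase characters, twins are uppercase, and `reduce_word`, `apply_map` and `is_group_solution` are hypothetical names.

```python
def inv(w: str) -> str:
    """Reverse the word and swap every letter with its twin."""
    return w[::-1].swapcase()


def apply_map(S: dict, w: str) -> str:
    """Image of the word w under the map S (variables are the keys of S)."""
    return "".join(
        (S[ch.lower()] if ch.islower() else inv(S[ch.lower()]))
        if ch.lower() in S else ch
        for ch in w
    )


def reduce_word(w: str) -> str:
    """Cancel factors u u^{-1} and u^{-1} u with a stack; '' stands for 1."""
    stack = []
    for ch in w:
        if stack and stack[-1] == ch.swapcase():
            stack.pop()             # ch cancels the letter before it
        else:
            stack.append(ch)
    return "".join(stack)


def is_group_solution(S: dict, w1: str, w2: str) -> bool:
    """Compare both sides after free reduction, as in a free group."""
    return reduce_word(apply_map(S, w1)) == reduce_word(apply_map(S, w2))
```

For instance, `{"x": "bA"}` solves the equation `("xa", "b")` in the free group, since $ba^{-1}a$ reduces to $b$, but it is not a solution in free SGA, where the two images are compared letter by letter.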

Proposition 8 (Makanin, Lemma 1.1 in [11]) For any non-contractible 
equation $E$ in the free group $G$ with generators $C$ we can construct a finite list 
$E_1, \ldots, E_k$ of systems of non-contractible equations in the free SGA $G'$ with 
generators $C$ such that the following conditions are satisfied: 

1. $E$ has a non-contractible solution in $G$ if and only if $k > 0$ and some system 
$E_j$ has a non-contractible solution in $G'$. 

2. There is a constant $c > 0$ such that $|E_i| \le |E| + c|E|_V^2$ and $|E_i|_V \le c|E|_V^2$ for 
each $i = 1, \ldots, k$. 

3. There is c > 0 constant such that k < (|A . 

Proof. This is essentially the proof in [11] with the bounds improved. Let $E$ be 
the equation 

$C_0 X_1 C_1 X_2 \cdots X_v C_v = 1,$ (4) 

where the $C_i$ are non-contractible, $v = |E|_V$, and the $X_i$ are meta-variables representing 
the actual variables in $E$. 

Let $S$ be a non-contractible solution of $E$. By a known result (see [11], p. 486), 
there is a set $W$ of non-contractible words in the alphabet $C$, $|W| \le 2v(2v+1)$, 
such that each $C_i$ and $S(X_i)$ can be written as a concatenation of no more than 
$2v$ words in $W$, and after replacement Equation (4) holds in the free group with 
generators $W$. 

Let $Z$ be a set of $2v(2v+1)$ fresh variables. Then choose words 
$y_0, x_1, y_1, x_2, \ldots, x_v, y_v \in (Z \cup Z^{-1})^*$, each of length at most $2v$ and non-contractible, 
and define the system of equations 

1. $C_j = y_j$, $j = 0, \ldots, v$, 

2. $X_j = x_j$, $j = 1, \ldots, v$. 

Each such set of equations, for which Equation (4) holds in the free group with 
generators $Z$ when replacing $C_i$ and $X_i$ by the corresponding words in $(Z \cup Z^{-1})^*$, 
defines one system $E_i$. 

It is clear from the result mentioned earlier that $E$ has a solution if and 
only if there is some $E_i$ which has a non-contractible solution. How many $E_i$ are 
there? No more than $[(2v(2v+1))^{2v}]^{2v+1}$. 

Theorem 9 For each equation $E$ in a free group $G$ with generators $C$ there 
is a finite set $Q$ of equations in a free semigroup with anti-involution $G'$ with 
generators $C \cup \{c_1, c_2\}$, $c_1, c_2 \notin C$, such that the following hold: 

1. $E$ is satisfiable in $G$ if and only if one of the equations in $Q$ is satisfiable in 
$G'$. 

2. There is a constant $c > 0$ such that for each $E' \in Q$, it holds that $|E'| \le c|E|^2$. 

3. IQI < for c > 0 o constant. 

Proof. By Proposition 8, there is a list of systems of non-contractible 
equations $E_1, \ldots, E_k$ which are equivalent to $E$ (w.r.t. non-contractible 
satisfiability). By Proposition 4, each such system $E_j$ is equivalent (w.r.t. satisfiability) 
to a non-contractible equation $E'$. Then, by Proposition 3, for each such 
non-contractible $E'$, there is a list of systems of equations (now without the restriction of 
non-contractibility) such that $E'$ has a non-contractible solution if 
and only if one of these systems has a solution (not necessarily non-contractible). 
Finally, by Proposition 4, for each of these systems we have an equation $E''$ which has 
the same solutions (if any). So we have a finite set of equations (the $E''$’s) 
with the property that $E$ is satisfiable in $G$ if and only if one of the $E''$ is 
satisfiable in $G'$. 

The bounds in 2. and 3. follow by easy calculations from the bounds in the 
corresponding results used above. 

Remark. It is not difficult to check that the set Q in the previous theorem can 
be generated non-deterministically in polynomial time. 

Corollary 10 Assume that fx is an upper bound for the deterministic TIME- 
complexity of the problem of satisfiability of equations in free SGA. Then 

max{/T(c|ifn, 

for c > 0 a constant, is an upper bound for the deterministic TIME-complexity 
of the problem of satisfiability of equations in free groups. 

4 Conclusions 

Our results show that solving equations in free SGA comprises the cases of free 
groups and free semigroups, the first with an exponential reduction (Theorem 
9), and the latter with a linear reduction (Proposition 6). This suggests that free 
SGA, due to its simplicity, is the ‘appropriate’ theory to study when seeking 
algorithms for solving equations in those theories. 

In a preliminary version of this paper we stated the following conjectures: 

1. Satisfiability of equations in free groups is PSPACE-hard. 

2. Satisfiability of equations in free groups is in EXPTIME. 

3. Satisfiability of equations in free SGA is decidable. 

In the meantime the author proved that satisfiability of equations in free SGA is 
in PSPACE, hence answering positively (2) and (3). Also independently, Diekert 
and Hagenah announced the solution of (3) [2]. 

Abstract. We describe here a construction on transducers that gives a 
new conceptual proof for two classical decidability results on transducers: 
it is decidable whether a finite transducer realizes a functional relation, 
and whether a finite transducer realizes a sequential relation. A better 
complexity then follows for the two decision procedures. 



In this paper we give a new presentation and a conceptual proof for two 
classical decision results on finite transducers. 

Transducers are finite automata with input and output; they thus realize 
relations between words, the so-called rational relations. Even though they are a 
very simple model of machines that compute relations (they can be seen as 
2-tape 1-way Turing machines), most problems, such as equivalence or 
intersection, are easily shown to be equivalent to the Post Correspondence 
Problem and thus undecidable. The situation is drastically different for transducers 
that are functional, that is, transducers that realize functions: the above 
problems then become easily decidable. This is of interest because of the 
following result. 

Theorem 1. [12] Functionality is a decidable property for finite transducers. 

Among the functional transducers, those which are deterministic in the 
input (they are called sequential) are probably the most interesting, both from a 
practical and from a theoretical point of view: they correspond to machines that 
can really and easily be implemented. A rational function is sequential if it can 
be realized by a sequential transducer. Of course, a non-sequential transducer 
may realize a sequential function, and whether this occurs is known to be decidable. 

Theorem 2. [7] Sequentiality is a decidable property for rational functions. 

The original proofs of these two theorems are based on what could be called a 
“pumping” principle, implying that a word which contradicts the property may 
be chosen of a bounded length, and thus directly providing decision procedures 
of exponential complexity. Theorem 1 was published again in [4], with exactly 
the same proof, hence the same complexity. 

Later, it was proved that the functionality of a transducer can be decided 
in polynomial time, as a particular case of a result obtained by reduction to 
another decision problem on another class of automata ([10, Theorem 2]). 

In this communication, we shall see how a very natural construction 
performed on the square of the transducer yields a decision procedure for the two 
properties, that is, whether the property holds or not can be read off the result 
of the construction. 

The size of the object constructed for deciding functionality is quadratic in 
the size of the considered transducer. In the case of sequentiality, one has to be 
more subtle for the constructed object may be too large. But it is shown that it 
can be decided in polynomial time whether this object has the desired property. 

Due to the limited space available in the proceedings, the proofs of the results 
are omitted here and will be published in a forthcoming paper. 

1 Preliminaries 

We basically follow the definitions and notation of [9,2] for automata. 

The set of words over a finite alphabet $A$, i.e. the free monoid over $A$, is 
denoted by $A^*$. Its identity, or empty word, is denoted by $1_{A^*}$. 

An automaton $\mathcal{A}$ over a finite alphabet $A$, noted $\mathcal{A} = (Q, A, E, I, T)$, is a 
directed graph labelled by elements of $A$; $Q$ is the set of vertices, called states, 
$I \subseteq Q$ is the set of initial states, $T \subseteq Q$ is the set of terminal states and 
$E \subseteq Q \times A \times Q$ is the set of labelled edges, called transitions. The automaton $\mathcal{A}$ 
is finite if $Q$ is finite. 

The definition of automata as labelled graphs extends readily to automata 
over any monoid: an automaton $\mathcal{A}$ over $M$, noted $\mathcal{A} = (Q, M, E, I, T)$, is a 
directed graph the edges of which are labelled by elements of the monoid $M$. A 
computation is a path in the graph $\mathcal{A}$; its label is the product of the labels of 
its transitions. A computation is successful if it begins with an initial state and 
ends with a terminal state. The behaviour of $\mathcal{A}$ is the subset of $M$ consisting of the 
labels of the successful computations of $\mathcal{A}$. 

A state of A is said to be accessible if it belongs to a computation that begins 
with an initial state; it is useful if it belongs to a successful computation. The 
automaton A is trim if all of its states are useful. The accessible part and the 
useful part of a finite automaton A are easily computable from A. 

An automaton $\mathcal{T} = (Q, A^* \times B^*, E, I, T)$ over a direct product $A^* \times B^*$ of two 
free monoids is called a transducer from $A^*$ to $B^*$. The behaviour of a transducer $\mathcal{T}$ 
is thus (the graph of) a relation $\alpha$ from $A^*$ into $B^*$: $\alpha$ is said to be realized by $\mathcal{T}$. 
A relation is rational (i.e. its graph is a rational subset of $A^* \times B^*$) if and only 
if it is realized by a finite transducer. 
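
To make the notion of behaviour concrete, here is a small sketch that enumerates the labels of the successful computations of a finite transducer up to a bounded number of transitions. The representation is an assumption of this sketch: transitions are triples `(p, (u, v), q)` with `u` and `v` Python strings, and `I` and `T` are plain sets of states. Checking functionality on such a bounded sample is only a sanity check and is not a decision procedure.

```python
def behaviour_up_to(E, I, T, max_len: int):
    """Labels (u, v) of successful computations with at most max_len transitions."""
    result = set()
    frontier = {(p, "", "") for p in I}          # computations of length 0
    for _ in range(max_len + 1):
        result |= {(u, v) for (p, u, v) in frontier if p in T}
        frontier = {
            (q, u + a, v + b)
            for (p, u, v) in frontier
            for (p1, (a, b), q) in E
            if p1 == p
        }
    return result


def functional_on(pairs) -> bool:
    """True when no input word is paired with two different outputs."""
    return len({u for u, _ in pairs}) == len(pairs)
```

For example, the one-state transducer with transitions `(0, ("a", "b"), 0)` and `(0, ("b", ""), 0)` rewrites every `a` as `b` and erases every `b`; the sample of its behaviour up to length 2 contains seven pairs and is functional.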

It is a slight generalization (which does not increase the generating power of 
the model) to consider transducers $\mathcal{T} = (Q, A^* \times B^*, E, I, T)$ where $I$ and $T$ 
are not subsets of $Q$ (i.e. functions from $Q$ into $\{0,1\}$) but functions from $Q$ 
into $B^* \cup \emptyset$ (the classical transducers are those for which the image of a state 
by $I$ or $T$ is either $\emptyset$ or $1_{B^*}$). 

A transducer is said to be real-time if the label of every transition is a 
pair $(a, v)$ where $a$ is a letter of $A$, the input of the transition, and $v$ a word over $B$, 
the output of the transition, and if for any states $p$ and $q$ and any letter $a$ there 
is at most one transition from $p$ to $q$ whose input is $a$. Using classical algorithms 
from automata theory, any transducer $\mathcal{T}$ can be transformed into a transducer 
that is real-time if $\mathcal{T}$ realizes a function ([9, Th. IX.5.1], [2, Prop. III.7.1]). 

If $\mathcal{T} = (Q, A^* \times B^*, E, I, T)$ is a real-time transducer, the underlying input 
automaton of $\mathcal{T}$ is the automaton $\mathcal{A}$ over $A$ obtained from $\mathcal{T}$ by forgetting the 
second component of the label of every transition and by replacing the functions $I$ 
and $T$ by their respective domains. The language recognized by $\mathcal{A}$ is the domain 
of the relation realized by $\mathcal{T}$. 

We call sequential a transducer that is real-time, functional, and whose 
underlying input automaton is deterministic. A function $\alpha$ from $A^*$ into $B^*$ is 
sequential if it can be realized by a sequential transducer. It has to be 
acknowledged that this is not the usual terminology: what we call “sequential” 
(transducers or functions) have been called “subsequential” since the seminal 
paper by Schützenberger [13] (cf. [2,5,7,8,11, etc.]). There are good reasons for 
this change of terminology, which has already been advocated by V. Bruyère and 
Ch. Reutenauer: “the word subsequential is unfortunate since these functions 
should be called simply sequential” ([5]). Someone has to make the first move. 

2 Squaring Automata and Ambiguity 

Before defining the square of a transducer, we recall what the square of an 
automaton is and how it can be used to decide whether an automaton is 
unambiguous or not. A trim automaton $\mathcal{A} = (Q, A, E, I, T)$ is unambiguous if any 
word it accepts is the label of a unique successful computation in $\mathcal{A}$. 

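Let $\mathcal{A}' = (Q', A, E', I', T')$ and $\mathcal{A}'' = (Q'', A, E'', I'', T'')$ be two automata 
on $A$. The Cartesian product of $\mathcal{A}'$ and $\mathcal{A}''$ is the automaton $\mathcal{C}$ defined by 

$\mathcal{C} = \mathcal{A}' \times \mathcal{A}'' = (Q' \times Q'', A, E, I' \times I'', T' \times T'')$ 
where $E$ is the set of transitions defined by 

$E = \{((p', p''), a, (q', q'')) \mid (p', a, q') \in E' \text{ and } (p'', a, q'') \in E''\}$. 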

Let $\mathcal{A} \times \mathcal{A} = (Q \times Q, A, F, I \times I, T \times T)$ be the Cartesian product of the 
automaton $\mathcal{A} = (Q, A, E, I, T)$ with itself; the set $F$ of transitions is defined by: 

$F = \{((p, r), a, (q, s)) \mid (p, a, q), (r, a, s) \in E\}$. 

Let us call diagonal of $\mathcal{A} \times \mathcal{A}$ the sub-automaton $\mathcal{D}$ of $\mathcal{A} \times \mathcal{A}$ determined by the 
diagonal $D$ of $Q \times Q$, i.e. $D = \{(q,q) \mid q \in Q\}$, as set of states. The states and 
transitions of $\mathcal{A}$ and $\mathcal{D}$ are in bijection, hence $\mathcal{A}$ and $\mathcal{D}$ are equivalent. 

Lemma 1. [3, Prop. IV.1.6] A trim automaton $\mathcal{A}$ is unambiguous if and only 
if the trim part of $\mathcal{A} \times \mathcal{A}$ is equal to $\mathcal{D}$. ■ 

Note that, like (un)ambiguity, determinism can also be described in terms 
of the Cartesian square, by a simple rewording of the definition: a trim automaton $\mathcal{A}$ 
is deterministic if and only if the accessible part of $\mathcal{A} \times \mathcal{A}$ is equal to $\mathcal{D}$. 
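
The two characterizations can be tried out with a short sketch. It assumes, for illustration only, that an automaton is given by a set `E` of triples `(p, a, q)` together with sets `I` and `T` of initial and terminal states, and that it is already trim; the function names are hypothetical. Since transitions form a set, it suffices to check that no state off the diagonal survives in the relevant part of the square.

```python
from itertools import product


def square(E, I, T):
    """Cartesian square: transitions, initial states, terminal states."""
    F = {((p, r), a, (q, s)) for (p, a, q) in E for (r, b, s) in E if a == b}
    return F, set(product(I, I)), set(product(T, T))


def reachable(starts, edges):
    """States reachable from `starts` along the (source, target) pairs in `edges`."""
    succ = {}
    for p, q in edges:
        succ.setdefault(p, []).append(q)
    seen, todo = set(starts), list(starts)
    while todo:
        p = todo.pop()
        for q in succ.get(p, []):
            if q not in seen:
                seen.add(q)
                todo.append(q)
    return seen


def is_unambiguous(E, I, T) -> bool:
    """Trim part of the square has no state outside the diagonal (A assumed trim)."""
    F, II, TT = square(E, I, T)
    accessible = reachable(II, [(p, q) for p, _, q in F])
    co_accessible = reachable(TT, [(q, p) for p, _, q in F])
    return all(p == r for (p, r) in accessible & co_accessible)


def is_deterministic(E, I, T) -> bool:
    """Accessible part of the square has no state outside the diagonal."""
    F, II, _ = square(E, I, T)
    return all(p == r for (p, r) in reachable(II, [(p, q) for p, _, q in F]))
```

An automaton with transitions `0 -a-> 1`, `0 -a-> 2`, `1 -b-> 3`, `2 -c-> 3` is unambiguous but not deterministic: the state `(1, 2)` of the square is accessible but not co-accessible.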

3 Product of an Automaton by an Action 

We recall now what is an action, how an action can be seen as an automaton, 
and what can be then defined as the product of a (normal) automaton by an 
action. We end this section with the definition of the specific action that will be 
used in the sequel. 

Actions. A (right) action of a monoid $M$ on a set $S$ is a mapping $\delta : S \times M \to S$ 
which is consistent with the multiplication in $M$: 

$\forall s \in S,\ \forall m, m' \in M \quad \delta(s, 1_M) = s$ and $\delta(\delta(s,m), m') = \delta(s, mm')$. 

We write $s \cdot m$ rather than $\delta(s,m)$ when it causes no ambiguity. 

Actions as automata. An action $\delta$ of $M$ on a set $S$ with $s_0$ as distinguished 
element may then be seen as an automaton on $M$ (without terminal states): 

$\mathcal{G}_\delta = (S, M, E, s_0)$ 

is defined by the set of transitions $E = \{(s, m, s \cdot m) \mid s \in S,\ m \in M\}$. 

Note that, as both $S$ and $M$ are usually infinite, the automaton $\mathcal{G}_\delta$ is “doubly” 
infinite: the set of states is infinite, and, for every state $s$, the set of transitions 
whose origin is $s$ is infinite as well. 

Product of an automaton by an action. Let $\mathcal{A} = (Q, M, E, I, T)$ be a (finite 
trim) automaton on a monoid $M$ and $\delta$ an action of $M$ on a (possibly infinite) 
set $S$. The product of $\mathcal{A}$ and $\mathcal{G}_\delta$ is the automaton on $M$: 

$\mathcal{A} \times \mathcal{G}_\delta = (Q \times S, M, F, I \times \{s_0\}, T \times S)$ 

the transitions of which are defined by 

$F = \{((p, s), m, (q, s \cdot m)) \mid s \in S,\ (p, m, q) \in E\}$. 

We shall call product of $\mathcal{A}$ by $\delta$, and denote by $\mathcal{A} \times \delta$, the accessible part of $\mathcal{A} \times \mathcal{G}_\delta$. 
The projection on the first component induces a bijection between the 
transitions of $\mathcal{A}$ whose origin is $p$ and the transitions of $\mathcal{A} \times \delta$ whose origin is $(p, s)$, 
for any $p$ in $Q$ and any $(p, s)$ in $\mathcal{A} \times \delta$. The following holds (by induction on the 
length of the computations): 

$(p, s) \xrightarrow{\;m\;} (q, t)$ in $\mathcal{A} \times \delta$ $\implies$ $t = s \cdot m$. 

We call value of a state $(p, s)$ of $\mathcal{A} \times \delta$ the element $s$ of $S$. We shall say that the 
product $\mathcal{A} \times \delta$ itself is a valuation if the projection on the first component is a 
1-to-1 mapping between the states of $\mathcal{A} \times \delta$ and the states of $\mathcal{A}$. 

Remark 1. Let us stress again the fact that $\mathcal{A} \times \delta$ is the accessible part of $\mathcal{A} \times \mathcal{G}_\delta$. 
This makes it possible for $\mathcal{A} \times \delta$ to be finite even though $\mathcal{G}_\delta$ is 
infinite (cf. Theorem 5). 
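
The construction of the accessible part can be sketched as a plain graph traversal. The representation below is assumed for illustration: transitions of the automaton are triples `(p, m, q)`, the action is a Python function `act(s, m)`, and a `max_states` guard (an addition of this sketch) stops the exploration when the product turns out to be too large or infinite.

```python
def product_by_action(E, I, s0, act, max_states: int = 10_000):
    """Accessible part of the product: returns (states, transitions)."""
    start = {(p, s0) for p in I}
    states, todo, F = set(start), list(start), set()
    while todo:
        p, s = todo.pop()
        for (p1, m, q) in E:
            if p1 != p:
                continue
            target = (q, act(s, m))            # the value is updated by the action
            F.add(((p, s), m, target))
            if target not in states:
                if len(states) >= max_states:
                    raise RuntimeError("product exceeds max_states")
                states.add(target)
                todo.append(target)
    return states, F


def is_valuation(states) -> bool:
    """True when each state of the automaton appears with a single value."""
    firsts = [p for p, _ in states]
    return len(firsts) == len(set(firsts))
```

As a toy example, take the action of words on $\{0,1,2\}$ that adds the number of occurrences of `a` modulo 3. On a cycle of three `a`-transitions the product is a valuation, whereas on the two-state loop `0 -a-> 1 -b-> 0` each state is reached with all three values.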

The “Advance or Delay” action. Let $B^*$ be a free monoid and let us denote 
by $H_B$ the set $H_B = (B^* \times 1_{B^*}) \cup (1_{B^*} \times B^*) \cup \{\mathbf{0}\}$. A mapping $\psi : B^* \times B^* \to H_B$ 
is defined by: 

$\psi(u,v) = (v^{-1}u, 1_{B^*})$ if $v$ is a prefix of $u$, 

$\psi(u,v) = (1_{B^*}, u^{-1}v)$ if $u$ is a prefix of $v$, 

$\psi(u,v) = \mathbf{0}$ otherwise. 

Intuitively, $\psi(u,v)$ tells either how much the first component $u$ is ahead of the 
second component $v$, or how much it is late, or if $u$ and $v$ are not prefixes of a 
common word. In particular, $\psi(u,v) = (1_{B^*}, 1_{B^*})$ if, and only if, $u = v$. 

Lemma 2. The mapping $\omega_B$ from $H_B \times (B^* \times B^*)$ into $H_B$ defined by:

$$\forall (f, g) \in H_B \setminus \mathbf{0} \qquad \omega_B((f, g), (u, v)) = \psi(fu, gv) \quad\text{and}\quad \omega_B(\mathbf{0}, (u, v)) = \mathbf{0}$$

is an action, which will be called the "Advance or Delay" (or "AD") action
(relative to $B^*$) and will thus be denoted henceforth by a dot. ■
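As an illustration of the definitions of $\psi$ and $\omega_B$, here is a minimal Python sketch. It is not taken from the paper: words of $B^*$ are assumed to be Python strings, the empty string plays the role of $1_{B^*}$, the absorbing element $\mathbf{0}$ is represented by `None`, and the names `psi` and `ad_action` are ours.

```python
from typing import Optional, Tuple

# An element of H_B is a pair (f, g) with f == "" or g == "",
# or ZERO (the absorbing element), represented here by None.
Value = Optional[Tuple[str, str]]
ZERO: Value = None


def psi(u: str, v: str) -> Value:
    """Advance or delay of u with respect to v, or ZERO."""
    if u.startswith(v):       # v is a prefix of u: u is ahead by v^{-1}u
        return (u[len(v):], "")
    if v.startswith(u):       # u is a prefix of v: u is late by u^{-1}v
        return ("", v[len(u):])
    return ZERO               # u and v are not prefixes of a common word


def ad_action(h: Value, u: str, v: str) -> Value:
    """omega_B(h, (u, v)): the AD action of B* x B* on H_B."""
    if h is ZERO:
        return ZERO           # ZERO is absorbing
    f, g = h
    return psi(f + u, g + v)
```

Under these assumptions, acting by $(u, v)$ and then by $(u', v')$ gives the same result as acting once by $(uu', vv')$, which is the consistency condition required of an action.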

Remark 2. The transition monoid of $\omega_B$ is isomorphic to $B^* \times B^*$
if $B$ has at least two letters, to $\mathbb{Z}$ if it has only one letter. (We
have denoted by $\mathbf{0}$ the absorbing element of $H_B$ under $\omega_B$ in
order to avoid confusion with $0$, the identity element of the monoid $\mathbb{Z}$.)



4 Deciding Functionality 



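Let $\mathcal{T} = (Q, A^* \times B^*, E, I, T)$ be a real-time trim transducer
such that the output of every transition is a single word of $B^*$ (recall that
this is a necessary condition for the relation realized by $\mathcal{T}$ to be a
function). The transducer $\mathcal{T}$ is not functional if and only if there
exist two distinct computations:

$$q'_0 \xrightarrow{a_1/u'_1} q'_1 \xrightarrow{a_2/u'_2} \cdots \xrightarrow{a_n/u'_n} q'_n \quad\text{and}\quad q''_0 \xrightarrow{a_1/u''_1} q''_1 \xrightarrow{a_2/u''_2} \cdots \xrightarrow{a_n/u''_n} q''_n$$

with $u'_1 u'_2 \cdots u'_n \neq u''_1 u''_2 \cdots u''_n$. There exists then at
least one $i$ such that $u'_i \neq u''_i$, and thus such that $q'_i \neq q''_i$.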

This implies, by projection on the first component, that the underlying input
automaton $\mathcal{A}$ of $\mathcal{T}$ is ambiguous. But it may be the case
that $\mathcal{A}$ is ambiguous and $\mathcal{T}$ still functional, as is shown
for instance by the transducer $Q_1$ represented on the top of Figure 1
(cf. [2]). We shall now carry over the method of the Cartesian square of
Section 2 from automata to transducers.

Cartesian square of a real-time transducer. By definition, the Cartesian
product of $\mathcal{T}$ by itself is the transducer $\mathcal{T} \times \mathcal{T}$
from $A^*$ into $B^* \times B^*$:

$$\mathcal{T} \times \mathcal{T} = (Q \times Q, A^* \times (B^* \times B^*), F, I \times I, T \times T)$$

whose transition set $F$ is defined by:

$$F = \{((p, r), (a, (u', u'')), (q, s)) \mid (p, (a, u'), q) \text{ and } (r, (a, u''), s) \in E\} .$$

The underlying input automaton of $\mathcal{T} \times \mathcal{T}$ is the square
of the underlying input automaton $\mathcal{A}$ of $\mathcal{T}$. If $\mathcal{A}$
is unambiguous, then $\mathcal{T}$ is functional, and the trim part of
$\mathcal{T} \times \mathcal{T}$ is reduced to its diagonal.

An effective characterization of functionality. The transducer
$\mathcal{T} \times \mathcal{T}$ is an automaton on the monoid
$M = A^* \times (B^* \times B^*)$. We can consider that the AD action is an
action of $M$ on $H_B$, by forgetting the first component. We can thus make the
product of $\mathcal{T} \times \mathcal{T}$, or of any of its subautomata, by
the AD action $\omega_B$.






Fig. 1. Cartesian square of $Q_1$, valued by the product with the action $\omega_B$.
As the output alphabet has only one letter, $H_B$ is identified with $\mathbb{Z}$
and the states are labelled by an integer. Labels of transitions are not shown:
the input is always $a$ and is kept implicit; an output of the form
$(b^n, b^m)$, where $b$ is the single output letter, is coded by the integer
$n - m$, which is itself symbolised by the drawing of the arrow: a dotted arrow
for 0, a simple solid arrow for +1, a double one for +2 and a bold one for +3;
and the corresponding dashed arrows for the negative values.

Theorem 3. A transducer $\mathcal{T}$ from $A^*$ into $B^*$ is functional if and
only if the product of the trim part $\mathcal{U}$ of the Cartesian square
$\mathcal{T} \times \mathcal{T}$ by the AD action $\omega_B$ is a valuation of
$\mathcal{U}$ such that the value of any final state is $(1_{B^*}, 1_{B^*})$. ■

Figure 1 shows the product of the Cartesian square of a transducer $Q_1$ by
the AD action^1.

Let us note that if $\alpha$ is the relation realized by $\mathcal{T}$, the
transducer obtained from $\mathcal{T} \times \mathcal{T}$ by forgetting the
first component is a transducer from $B^*$ into itself that realizes the
composition product $\alpha \circ \alpha^{-1}$. The condition expressed may then
be seen as a condition for $\alpha \circ \alpha^{-1}$ being the identity, which
is clearly a condition for the functionality of $\alpha$.
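The characterization of Theorem 3 can be read as a decision procedure. The following Python sketch is only an illustration under stated assumptions, not the authors' implementation: the transducer is assumed to be real-time and trim, and is given as a list of hypothetical tuples `(p, a, u, q)` (origin, input letter, output word, end); the absorbing element is represented by `None`; all function names are ours. It builds the Cartesian square, keeps its trim part, propagates values with the AD action, and rejects as soon as a state receives two different values.

```python
from collections import deque
from itertools import product

ZERO = None            # absorbing element of H_B (our representation)
IDENTITY = ("", "")    # the pair (1_{B*}, 1_{B*})


def psi(u, v):
    if u.startswith(v):
        return (u[len(v):], "")
    if v.startswith(u):
        return ("", v[len(u):])
    return ZERO


def ad_action(h, u, v):
    return ZERO if h is ZERO else psi(h[0] + u, h[1] + v)


def reachable(starts, neighbours):
    """States reachable from `starts` (included) by breadth-first search."""
    seen, todo = set(starts), deque(starts)
    while todo:
        x = todo.popleft()
        for y in neighbours(x):
            if y not in seen:
                seen.add(y)
                todo.append(y)
    return seen


def is_functional(transitions, initials, finals):
    # Cartesian square: pair the transitions that read the same input letter.
    succ, pred = {}, {}
    for (p, a, u1, q), (r, b, u2, s) in product(transitions, repeat=2):
        if a == b:
            succ.setdefault((p, r), []).append((u1, u2, (q, s)))
            pred.setdefault((q, s), []).append((p, r))
    init = set(product(initials, repeat=2))
    final = set(product(finals, repeat=2))
    # Trim part: accessible from I x I and co-accessible to T x T.
    trim = (reachable(init, lambda x: [y for _, _, y in succ.get(x, [])])
            & reachable(final, lambda x: pred.get(x, [])))
    # Product with the AD action: each state must receive a single value.
    value = {x: IDENTITY for x in init & trim}
    todo = deque(value)
    while todo:
        x = todo.popleft()
        for u1, u2, y in succ.get(x, []):
            if y not in trim:
                continue
            h = ad_action(value[x], u1, u2)
            if y not in value:
                value[y] = h
                todo.append(y)
            elif value[y] != h:
                return False   # two values for one state: not a valuation
    return all(value[x] == IDENTITY for x in final & trim)
```

For instance, a transducer with transitions `(0,'a','b',1)`, `(1,'a','b',3)`, `(0,'a','',2)`, `(2,'a','bb',3)`, initial state 0 and final state 3 has an ambiguous input automaton but is reported functional, since both computations on `aa` output `bb`.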

5 Deciding Sequentiality 

The original proof of Theorem 2 indeed goes in three steps: first, sequential
functions are characterized by a property expressed by means of a distance
function, then this property (on the function) is proved to be equivalent to a
property on the transducer, and finally a pumping-lemma-like procedure is given
for deciding the latter property (cf. [7,2]). We shall see how the last two
steps can be replaced by the computation of the product of the Cartesian square
of the transducer by the AD action. We first recall the first step.

5.1 A Quasi-Topological Characterization of Sequential Functions

If $f$ and $g$ are two words, we denote by $f \wedge g$ the longest prefix
common to $f$ and $g$. The free monoid is then equipped with the prefix distance

$$\forall f, g \in A^* \qquad d_p(f, g) = |f| + |g| - 2\,|f \wedge g| .$$

In other words, if $f = hf'$ and $g = hg'$ with $h = f \wedge g$, then
$d_p(f, g) = |f'| + |g'|$.
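The prefix distance is easy to compute directly from its definition. The short Python sketch below is given as an illustration only; words are assumed to be Python strings and the function names are ours.

```python
def longest_common_prefix(f: str, g: str) -> str:
    """Return f ^ g, the longest prefix common to f and g."""
    n = 0
    while n < min(len(f), len(g)) and f[n] == g[n]:
        n += 1
    return f[:n]


def prefix_distance(f: str, g: str) -> int:
    """d_p(f, g) = |f| + |g| - 2 |f ^ g|."""
    return len(f) + len(g) - 2 * len(longest_common_prefix(f, g))
```

With $f = hf'$ and $g = hg'$, the value returned is $|f'| + |g'|$: for `"abc"` and `"abd"` the common prefix is `"ab"` and the distance is 2.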

Definition 1. A function $\alpha : A^* \to B^*$ is said to be uniformly
diverging^2 if for every integer $n$ there exists an integer $N$ which is
greater than the prefix distance of the images by $\alpha$ of any two words (in
the domain of $\alpha$) whose prefix distance is smaller than $n$, i.e.

$$\forall n \in \mathbb{N} ,\ \exists N \in \mathbb{N} ,\ \forall f, g \in \operatorname{Dom} \alpha \qquad d_p(f, g) \leq n \implies d_p(f\alpha, g\alpha) \leq N .$$

Theorem 4. [7,13] A rational function is sequential if, and only if, it is
uniformly diverging.

Remark 3. The characterization of sequential functions by uniform divergence
holds in the larger class of functions whose inverse preserves rationality. This
is a generalization of a theorem of Ginsburg and Rose due to Choffrut, a much
stronger result, the full strength of which will not be of use here (cf. [5,8]).

^1 It turns out that, in this case, the trim part is equal to the whole square.

^2 After [7] and [2], the usual terminology is "function with bounded variation".
We rather avoid an expression that is already used, with another meaning, in
other parts of mathematics.

5.2 An Effective Characterization of Sequential Functions

Theorem 5. A transducer $\mathcal{T}$ realizes a sequential function if, and
only if, the product of the accessible part $\mathcal{V}$ of
$\mathcal{T} \times \mathcal{T}$ by the AD action $\omega_B$

i) is finite;

ii) has the property that if a state with value $\mathbf{0}$ belongs to a cycle
in $\mathcal{V}$, then the label of that cycle is $(1_{B^*}, 1_{B^*})$. ■

The parallel between automata and transducers is now to be emphasized.
Unambiguous (resp. deterministic) automata are characterized by a condition on
the trim (resp. accessible) part of the Cartesian square of the automaton
whereas functional transducers (resp. transducers that realize sequential
functions) are characterized by a condition on the product by $\omega_B$ of the
trim (resp. accessible) part of the Cartesian square of the transducer.

Figure 2 shows two cases where the function is sequential: in (a) since the
accessible part of the product is finite and no state has value $\mathbf{0}$;
in (b) since the accessible part of the product is finite as well and the
states whose value is $\mathbf{0}$ all belong to a cycle every transition of
which is labelled by $(1_{B^*}, 1_{B^*})$.




Figure 3 shows two cases where the function is not sequential: in (a) since
the accessible part of the product is infinite; in (b) since, although the
accessible part of the product is finite, some states whose value is
$\mathbf{0}$ belong to a cycle whose label is different from $(1_{B^*}, 1_{B^*})$.

The following lemma is the key to the proof of Theorem 5 as well as to its 
effectivity. 

Lemma 3. Let $w = (1_{B^*}, z)$ be in $H_B \setminus \mathbf{0}$ and $(u, v)$ in
$B^* \times B^* \setminus (1_{B^*}, 1_{B^*})$. Then the set
$\{w \cdot (u, v)^n \mid n \in \mathbb{N}\}$ is finite and does not contain
$\mathbf{0}$ if, and only if, $u$ and $v$ are conjugate words and $z$ is a
prefix of a power of $u$. ■

Remark 4. The original proof of Theorem 2 by Ch. Choffrut goes by the
definition of the so-called twinning property (cf. [2, p. 128]). It is not
difficult to check that two states $p$ and $q$ of a real-time transducer
$\mathcal{T}$ are (non trivially) twinned when: i) $(p, q)$ is accessible in
$\mathcal{T} \times \mathcal{T}$; ii) $(p, q)$ belongs to a cycle in
$\mathcal{V}$ every transition of which is not labelled by $(1_{B^*}, 1_{B^*})$;
iii) $(p, q)$ has not the value $\mathbf{0}$ in the product of $\mathcal{V}$ by
$\omega_B$.

It is then shown that a transducer realizes a sequential function if, and only
if, every pair of its states has the twinning property.




Fig. 3. Two transducers, (a) and (b), that realize non sequential functions.



6 The Complexity Issue 

The "size" of an automaton $\mathcal{A}$ (on a free monoid $A^*$) is measured by
the number $m$ of transitions. (The size $|A| = k$ of the (input) alphabet is
seen as a constant.) The size of a transducer $\mathcal{T}$ will be measured by
the sum of the sizes of its transitions where the size of a transition
$(p, (u, v), q)$ is the length $|uv|$. It is denoted by $|\mathcal{T}|$.

The size of the transducer $\mathcal{T} \times \mathcal{T}$ is $|\mathcal{T}|^2$
and the complexity to build it is proportional to that size. The complexity of
determining the trim part as well as the accessible part is linear in the size
of the transducer.

Deciding whether the product of the trim part $\mathcal{U}$ of
$\mathcal{T} \times \mathcal{T}$ by the AD action $\omega_B$ is a valuation of
$\mathcal{U}$ (and if the value of any final state is $(1_{B^*}, 1_{B^*})$) is
again linear in the size of $\mathcal{U}$. Hence deciding whether a transducer
$\mathcal{T}$ is functional is quadratic in the size of the transducer. Note
that the same complexity is also established

The complexity of a decision procedure for the sequentiality of a function,
based on Theorem 5, is polynomial. However, this is less straightforward to
establish than for functionality, for the size of the product
$\mathcal{V} \times \omega_B$ may be exponential.

One first checks whether the label of every cycle in $\mathcal{V}$ is of the
form $(u, v)$ with $|u| = |v|$. It suffices to check it on a base of simple
cycles and this can be done by a depth-first search in $\mathcal{V}$. Let us
call true cycle a cycle which is not labelled by $(1_{B^*}, 1_{B^*})$ and let
$\mathcal{W}$ be the subautomaton of $\mathcal{V}$ consisting of states from
which a true cycle is accessible. By Theorem 5, it suffices to consider the
product $\mathcal{W} \times \omega_B$. This product may still be of exponential
size. However, one does not construct it entirely. For every state of
$\mathcal{W}$, the number of values which are to be considered in
$\mathcal{W} \times \omega_B$ may be bounded by the size of $\mathcal{T}$. This
yields an algorithm of polynomial complexity in order to decide the
sequentiality of the function realized by $\mathcal{T}$.

In [1], it is shown directly that the twinning property is decidable in
polynomial time.

Abstract. In this paper, we introduce a special class of Büchi automata
called unambiguous. In these automata, any infinite word labels exactly
one path going infinitely often through final states. The word is accepted
by the automaton if this path starts at an initial state. The main result of
the paper is that any rational set of infinite words is recognized by such
an automaton. We also provide two characterizations of these automata.
We finally show that they are well suited for boolean operations.



1 Introduction 

Automata on infinite words have been introduced by Büchi [3] in order to prove
the decidability of the monadic second-order logic of the integers. Since then,
automata on infinite objects have often been used to prove the decidability of
numerous problems. From a more practical point of view, they also lead to
efficient decision procedures, as for temporal logic [12]. Therefore, automata
on infinite words or infinite trees are one of the most important ingredients
in model checking tools [14]. The complementation of automata is then an
important issue since the systems are usually modeled by logical formulas which
involve the negation operator.

There are several kinds of automata that recognize sets of infinite words.
In 1962, Büchi [3] introduced automata on $\omega$-words, now referred to as
Büchi automata. These automata have initial and final states and a path is
successful if it starts at an initial state and goes infinitely often through
final states. However, not all rational sets of infinite words are recognized
by a deterministic Büchi automaton [5]. Therefore, complementation is a rather
difficult operation on Büchi automata [12].

In 1963, Muller [9] introduced automata, now referred to as Muller automata,
whose accepting condition is a family of accepting subsets of states. A path is
then successful if it starts at the unique initial state and if the set of
states which occur infinitely often in the path is accepting. A deep result of
McNaughton [6] shows that any rational set of infinite words is recognized by a
deterministic Muller automaton. A deterministic automaton is unambiguous in the
following sense. With each word is associated a canonical path which is the
unique path starting at the initial state. A word is then accepted iff its
canonical path is successful. In a deterministic Muller automaton, the
unambiguity is due to the uniqueness of the initial state and to the
determinism of the transitions. Independently, the acceptance condition
determines if a path is successful or not. The unambiguity of a deterministic
Muller automaton makes it easy to complement. It suffices to exchange accepting
and non-accepting subsets of states. However, the main drawback of using
deterministic Muller automata is that the acceptance condition is much more
complicated. It is a family of subsets of states instead of a simple set of
final states. There are other kinds of deterministic automata recognizing all
rational sets of infinite words like Rabin automata [11], Streett automata or
parity automata [8]. In all these automata, the acceptance condition is more
complicated than a simple set of final states.

In this paper, we introduce a class of Büchi automata in which any infinite
word labels exactly one path going infinitely often through final states.
A canonical path can then be associated with each infinite word and we call
these automata unambiguous. In these automata, the unambiguity is due to
the transitions and to the final states whereas the initial states determine if
a path is successful. An infinite word is then accepted iff its canonical path
starts at an initial state. The main result is that any rational set of
infinite words is recognized by such an automaton. It turns out that these
unambiguous Büchi automata are codeterministic, i.e., reverse deterministic.
Our result is thus the counterpart of McNaughton's result for codeterministic
automata. It has already been proved independently in [7] and [2] that any
rational set of infinite words is recognized by a codeterministic automaton but
the construction given in [2] does not provide unambiguous automata. We also
show that unambiguous automata are well suited for boolean operations and
especially complementation. In particular, our construction can be used to find
a Büchi automaton which recognizes the complement of the set recognized by
another Büchi automaton. For a Büchi automaton with $n$ states, our
construction provides an unambiguous automaton which has at most $(12n)^n$
states.

The unambiguous automata introduced in the paper recognize right-infinite 
words. However, the construction can be adapted to bi-infinite words. Two un- 
ambiguous automata on infinite words can be joined to make an unambiguous 
automaton on bi-infinite words. This leads to an extension of McNaughton’s 
result to the realm of bi-infinite words. 

The main result of this paper has been first obtained by the second author 
and his proof has circulated as a hand-written manuscript among a bunch of 
people. It was however never published. Later, the first author found a different 
proof of the same result based on algebraic constructions on semigroups. Both 
authors have decided to publish their whole work on this subject together. 

The paper is organized as follows. Section 2 is devoted to basic definitions on
words and automata. Unambiguous Büchi automata are defined in Sect. 3. The
main result (Theorem 1) is stated there. The first properties of these automata
are presented in Sect. 4. Boolean operations are studied in Sect. 5.

2 Automata

We recall here some elements of the theory of rational sets of finite and
infinite words. For further details on automata and rational sets of finite
words, see [10] and for background on automata and rational sets of infinite
words, see [13]. Let $A$ be a set called an alphabet and usually assumed to be
finite. We respectively denote by $A^*$ and $A^+$ the set of finite words and
the set of nonempty finite words. The set of right-infinite words, also called
$\omega$-words, is denoted by $A^\omega$.

A Büchi automaton $\mathcal{A} = (Q, A, E, I, F)$ is a non-deterministic
automaton with a set $Q$ of states, subsets $I, F \subseteq Q$ of initial and
final states and a set $E \subseteq Q \times A \times Q$ of transitions. A
transition $(p, a, q)$ of $\mathcal{A}$ is denoted by $p \xrightarrow{a} q$.
A path in $\mathcal{A}$ is an infinite sequence

$$\gamma : q_0 \xrightarrow{a_0} q_1 \xrightarrow{a_1} q_2 \cdots$$

of consecutive transitions. The starting state of the path is $q_0$ and the
$\omega$-word $\lambda(\gamma) = a_0 a_1 \ldots$ is called the label of
$\gamma$. A final path is a path $\gamma$ such that at least one of the final
states of the automaton is infinitely repeated in $\gamma$. A successful path
is a final path which starts at an initial state.

As usual, an $\omega$-word is accepted by the automaton if it is the label of a
successful path. The set of accepted $\omega$-words is said to be recognized by
the automaton and is denoted by $L(\mathcal{A})$. It is well known that a set
of $\omega$-words is rational iff it is recognized by some automaton.

A state of a Büchi automaton $\mathcal{A}$ is said to be coaccessible if it is
the starting state of a final path. A Büchi automaton is said to be trim if all
states are coaccessible. Any state which occurs in a final path is coaccessible
and thus non-coaccessible states of an automaton can be removed. In the sequel,
automata are usually assumed to be trim.

An automaton $\mathcal{A} = (Q, A, E, I, F)$ is said to be codeterministic if
for any state $q$ and any letter $a$, there is at most one incoming transition
$p \xrightarrow{a} q$ for some state $p$. If this condition is met, for any
state $q$ and any finite word $w$, there is at most one path
$p \xrightarrow{w} q$ ending in $q$.
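The definition of codeterminism translates directly into a check on the transition set. The Python sketch below is an illustration under our own conventions: transitions are assumed to be given as triples `(p, a, q)`, and the function names are hypothetical. The second function follows a finite word backwards from its end state, which is how the unique path ending in $q$ can be recovered in a codeterministic automaton.

```python
from collections import Counter


def is_codeterministic(transitions):
    """True iff every (letter, end state) pair has at most one incoming transition."""
    incoming = Counter((a, q) for (_p, a, q) in set(transitions))
    return all(count <= 1 for count in incoming.values())


def backward_path(transitions, q, w):
    """States of the unique path labelled w ending in q, or None if there is none.

    Assumes that the automaton is codeterministic.
    """
    pred = {(a, t): s for (s, a, t) in transitions}
    path = [q]
    for a in reversed(w):          # read the word from right to left
        key = (a, path[-1])
        if key not in pred:
            return None
        path.append(pred[key])
    path.reverse()
    return path
```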

3 Unambiguous Automata 

In this section, we introduce the concept of unambiguous Büchi automata. We
first give the definition and we state one basic property of these automata. We
then establish a characterization of these automata. We give some examples and
we state the main result.

Definition 1. A Büchi automaton $\mathcal{A}$ is said to be unambiguous
(respectively complete) iff any $\omega$-word labels at most (respectively at
least) one final path in $\mathcal{A}$.

The set of final paths is only determined by the transitions and the final
states of $\mathcal{A}$. Thus, the property of being unambiguous or complete
does not depend on the set of initial states of $\mathcal{A}$. In the sequel,
we will freely say that an automaton $\mathcal{A}$ is unambiguous or complete
without specifying its set of initial states.

The definition of the word "complete" we use here is not the usual definition
given in the literature. A deterministic automaton is usually said to be
complete if for any state $q$ and letter $a$ there is at least one outgoing
transition labeled by $a$. This definition implies that any finite or infinite
word labels at least one path starting at the initial state. It will be stated
in Proposition 1 that unambiguity implies that the automaton is
codeterministic. Thus, we should reverse the definition and we should say that
for any state $q$ and letter $a$ there is at least one incoming transition
labeled by $a$. However, since the words are right-infinite, this condition no
longer implies that any $\omega$-word labels a path going infinitely often
through final states, as is shown in Example 3. Thus, the definition chosen in
this paper really ensures that any $\omega$-word is the label of a final path.
It will be stated in Proposition 1 that this condition is actually stronger
than the usual one.

In the sequel, we write UBA for Unambiguous Büchi Automaton and CUBA for
Complete Unambiguous Büchi Automaton. The following example is the simplest
CUBA.

Example 1. The automaton $(\{0\}, A, E, I, \{0\})$ with
$E = \{0 \xrightarrow{a} 0 \mid a \in A\}$ is obviously a CUBA. It recognizes
the set $A^\omega$ of all $\omega$-words if the state 0 is initial and
recognizes the empty set otherwise. It is called the trivial CUBA.

The following proposition states that a UBA must be codeterministic. Such
an automaton can be seen as a deterministic automaton which reads infinite
words from right to left. It starts at infinity and ends at the beginning of the
word. Codeterministic automata on infinite words have already been considered
in [2]. It is proved in that paper that any rational set of $\omega$-words is
recognized by a codeterministic automaton. Our main theorem generalizes this
result. It states that any rational set of $\omega$-words is recognized by a
CUBA.

Proposition 1. Let $\mathcal{A} = (Q, A, E, I, F)$ be a trim Büchi automaton.
If $\mathcal{A}$ is unambiguous, then $\mathcal{A}$ is codeterministic. If
$\mathcal{A}$ is complete, then for any state $q$ and any letter $a$, there is
at least one incoming transition $p \xrightarrow{a} q$ for some state $p$.

The second statement of the proposition says that our definition of complete- 
ness implies the usual one. Example 3 shows that the converse does not hold. 
However, Proposition 3 provides some additional condition on the automaton to 
ensure that it is unambiguous and complete. 

Before giving some other examples of CUBA, we provide a simple characteri- 
zation of CUBA which makes it easy to verify that an automaton is unambiguous 
and complete. This proposition also shows that it can be effectively checked if a 
given automaton is unambiguous or complete. 

Let $\mathcal{A} = (Q, A, E, I, F)$ be a Büchi automaton and let $q$ be a state
of $\mathcal{A}$. We denote by $\mathcal{A}_q = (Q, A, E, \{q\}, F)$ the new
automaton obtained by taking the singleton $\{q\}$ as set of initial states.
The set $L(\mathcal{A}_q)$ is then the set of $\omega$-words labeling a final
path starting at state $q$.

Proposition 2. Let $\mathcal{A} = (Q, A, E, I, F)$ be a Büchi automaton. For
$q \in Q$, let $\mathcal{A}_q$ be the automaton $(Q, A, E, \{q\}, F)$. The
automaton $\mathcal{A}$ is unambiguous iff the sets $L(\mathcal{A}_q)$ are
pairwise disjoint. The automaton $\mathcal{A}$ is complete iff
$A^\omega \subseteq \bigcup_{q \in Q} L(\mathcal{A}_q)$.

In particular, the automaton $\mathcal{A}$ is unambiguous and complete iff the
family of sets $L(\mathcal{A}_q)$ for $q \in Q$ is a partition of $A^\omega$.
It can be effectively verified that the two sets recognized by the automata
$\mathcal{A}_q$ and $\mathcal{A}_{q'}$ are disjoint for $q \neq q'$. It can
then be checked if the automaton is unambiguous. Furthermore, this test can be
performed in polynomial time. The set $\bigcup_{q \in Q} L(\mathcal{A}_q)$ is
recognized by the automaton $\mathcal{A}_Q = (Q, A, E, Q, F)$ all of whose
states are initial. The inclusion
$A^\omega \subseteq \bigcup_{q \in Q} L(\mathcal{A}_q)$ holds iff this
automaton recognizes $A^\omega$. This can be checked but it does not seem that
it can be performed in polynomial time.

We now come to examples. We use Proposition 2 to verify that the following
two automata are unambiguous and complete. In the figures, a transition
$p \xrightarrow{a} q$ of an automaton is represented by an arrow labeled by $a$
from $p$ to $q$. Initial states have a small incoming arrow while final states
are marked by a double circle. A Büchi automaton which is complete but
ambiguous is given in Example 4.

Fig. 1. CUBA of Example 2

Example 2. Let $A$ be the alphabet $A = \{a, b\}$ and let $\mathcal{A}$ be the
automaton pictured in Fig. 1. This automaton is unambiguous and complete since
we have $L(\mathcal{A}_0) = aA^\omega$ and $L(\mathcal{A}_1) = bA^\omega$. It
recognizes the set of $\omega$-words beginning with an $a$.




Fig. 2. CUBA of Example 3 



The following example shows that a CUBA may have several connected components.

Example 3. Let $A$ be the alphabet $A = \{a, b\}$ and let $\mathcal{A}$ be the
automaton pictured in Fig. 2. It is unambiguous and complete since we have
$L(\mathcal{A}_0) = A^*ba^\omega$, $L(\mathcal{A}_1) = a^\omega$,
$L(\mathcal{A}_2) = a(A^*b)^\omega$ and $L(\mathcal{A}_3) = b(A^*b)^\omega$. It
recognizes the set $(A^*b)^\omega$ of $\omega$-words having an infinite number
of $b$.

The automaton of the previous example has two connected components. Since
it is unambiguous and complete, any $\omega$-word labels exactly one final path
in this automaton. This final path is in the first component if the
$\omega$-word has finitely many $b$ and it is in the second component
otherwise. This automaton shows that our definition of completeness for an
unambiguous Büchi automaton is stronger than the usual one. Any connected
component is complete in the usual sense if it is considered as a whole
automaton. For any letter $a$ and any state $q$ in this component, there is
exactly one incoming transition $p \xrightarrow{a} q$. However, each component
is not complete according to our definition since not every $\omega$-word
labels a final path in this component.

In the realm of finite words, an automaton is usually made unambiguous by
the usual subset construction [4, p. 22]. This construction associates with an
automaton $\mathcal{A}$ an equivalent deterministic automaton whose states are
subsets of states of $\mathcal{A}$. Since left and right are symmetric for
finite words, this construction can be reversed to get a codeterministic
automaton which is also equivalent to $\mathcal{A}$. In the case of infinite
words, the result of McNaughton [6] states that a Büchi automaton can be
replaced by an equivalent Muller automaton which is deterministic. However,
this construction cannot be reversed since $\omega$-words are right-infinite.
We have seen in Proposition 1 that a CUBA is codeterministic. The following
theorem is the main result of the paper. It states that any rational set of
$\omega$-words is recognized by a CUBA. This theorem is thus the counterpart of
McNaughton's result for codeterministic automata. Like Muller automata, CUBA
make complementation very easy. This will be shown in Sect. 5. The proof of
Theorem 1 contains a new proof that the class of rational sets of
$\omega$-words is closed under complementation.

Theorem 1. Any rational set of $\omega$-words is recognized by a complete
unambiguous Büchi automaton.

There are two proofs of this result which are both rather long. Both proofs
yield effective procedures which give a CUBA recognizing a given set of
$\omega$-words. The first proof is based on graphs and it directly constructs a
CUBA from a Büchi automaton recognizing the set. The second proof is based on
semigroups and it constructs a CUBA from a morphism from $A^+$ into a finite
semigroup recognizing the set. An important ingredient of both proofs is the
notion of a generalized Büchi automaton.

In a Büchi automaton, the set of final paths is the set of paths which go
infinitely often through final states. In a generalized Büchi automaton, the
set of final paths is given in a different way. A generalized Büchi automaton
is equipped with an output function $\mu$ which maps any transition to a
nonempty word over an alphabet $B$ and with a fixed set $K$ of $\omega$-words
over $B$. A path is final if the concatenation of the outputs of its
transitions belongs to $K$. A generalized Büchi automaton can be seen as an
automaton with an output function. We point out that usual Büchi automata are a
particular case of generalized Büchi automata. Indeed, if the function $\mu$
maps any transition $p \xrightarrow{a} q$ to 1 if $p$ or $q$ is final and to 0
otherwise and if $K = (0^*1)^\omega$ is the set of $\omega$-words over
$\{0, 1\}$ having an infinite number of 1, a path in $\mathcal{A}$ is final if
some final state occurs infinitely often in it.

The notions of unambiguity and completeness are then extended to generalized
Büchi automata. A generalized Büchi automaton is said to be unambiguous
(respectively complete) if any $\omega$-word labels at most (respectively at
least) one final path.

Generalized Büchi automata can be composed. If a set $X$ is recognized by an
automaton $\mathcal{A}$ whose fixed set $K$ is recognized by an automaton
$\mathcal{B}$ which has a fixed set $K'$, then $X$ is also recognized by an
automaton having the fixed set $K'$ which can be easily constructed from
$\mathcal{A}$ and $\mathcal{B}$. Furthermore, this composition is compatible
with unambiguity and completeness. This means that if both automata
$\mathcal{A}$ and $\mathcal{B}$ are unambiguous (respectively complete), then
the automaton obtained by composition is also unambiguous (respectively
complete).

4 Properties and Characterizations 

In this section, we present some additional properties of CUBA. We first give
another characterization of CUBA which involves loops going through final
states. We present some consequences of this characterization. The
characterization of CUBA given in Proposition 2 uses sets of $\omega$-words.
The family of sets of $\omega$-words labeling a final path starting in the
different states must be a partition of the set of all $\omega$-words. The
following proposition only uses sets of finite words to characterize UBA and
CUBA.

Proposition 3. Let $\mathcal{A} = (Q, A, E, I, F)$ be a Büchi automaton such
that for any state $q$ and any letter $a$, there exists exactly one incoming
transition $p \xrightarrow{a} q$. Let $S_q$ be the set of nonempty finite words
$w$ such that there is a path $q \xrightarrow{w} q$ going through a final
state. The automaton $\mathcal{A}$ is unambiguous iff the sets $S_q$ are
pairwise disjoint. The automaton $\mathcal{A}$ is unambiguous and complete iff
the family of sets $S_q$ for $q \in Q$ is a partition of $A^+$. In this case,
the final path labeled by the periodic $\omega$-word $w^\omega$ is the path

$$q \xrightarrow{w} q \xrightarrow{w} q \cdots$$

where $q$ is the unique state such that $w \in S_q$.

The second statement of the proposition says that if the automaton
$\mathcal{A}$ is supposed to be unambiguous, it is complete iff the inclusion
$A^+ \subseteq \bigcup_{q \in Q} S_q$ holds. The assumption that the automaton
is unambiguous is necessary. As the following example shows, it is not true in
general that the automaton is complete iff the inclusion holds.

Example 4. The automaton of Fig. 3 is ambiguous since the $\omega$-word
$b^\omega$ labels two final paths. Since this automaton is deterministic and
all states are final, it is complete. However, it is not true that
$A^+ \subseteq \bigcup_{q \in Q} S_q$. Indeed, no loop in this automaton is
labeled by the finite word $a$.

Fig. 3. Automaton of Example 4

Proposition 3 gives another method to check if a given Büchi automaton is
unambiguous and complete. It must be first verified that for any state $q$ and
any letter $a$, there is exactly one incoming transition
$p \xrightarrow{a} q$. Then, it must be checked if the family of sets $S_q$ for
$q \in Q$ forms a partition of $A^+$. The sets $S_q$ are rational and a
codeterministic automaton recognizing $S_q$ can be easily deduced from the
automaton $\mathcal{A}$. It is then straightforward to verify that the sets
$S_q$ form a partition of $A^+$.

The last statement of Proposition 3 says that the final path labeled by a 
periodic word is also periodic. It is worth mentioning that the same result does 
not hold for deterministic automata. 

It follows from Proposition 3 that the trivial CUBA with one state (see
Example 1) is the only CUBA which is deterministic.



5 Boolean Combinations 

In this section, we show that CUBA behave well with respect to the boolean
operations. From CUBA recognizing two sets $X$ and $Y$, CUBA recognizing the
complement $A^\omega \setminus X$, the union $X \cup Y$ and the intersection
$X \cap Y$ can be easily obtained. For usual Büchi automata or for Muller
automata, automata recognizing the union and the intersection are easy to get.
It is sufficient to consider the product of the two automata with some small
additional memory. However, complementation is very difficult for general Büchi
automata.



5.1 Complement 

We begin with complementation which turns out to be a very easy operation 
for CUBA. Indeed, it suffices to change the initial states of the automaton to 
recognize the complement. 

Proposition 4. Let $\mathcal{A} = (Q, A, E, I, F)$ be a CUBA recognizing a set $X$ of
$\omega$-words. The automaton $\mathcal{A}' = (Q, A, E, Q \setminus I, F)$, where $Q \setminus I$ is the set of
non-initial states, is unambiguous and complete and it recognizes the complement
$A^\omega \setminus X$ of $X$.

It must be pointed out that it is really necessary for the automaton $\mathcal{A}$ to be
unambiguous and complete. Indeed, if $\mathcal{A}$ is ambiguous, it may happen that an
$\omega$-word $x$ of $X$ labels a final path starting at an initial state and another final
path starting at a non-initial state. In this case, the $\omega$-word $x$ is also recognized
by the automaton $\mathcal{A}'$. If $\mathcal{A}$ is not complete, some $\omega$-word $x$ labels no final path.
This $\omega$-word, which does not belong to $X$, is not recognized by the automaton $\mathcal{A}'$ either.

By the previous result, the proof of Theorem 1 also provides a new proof of the
fact that the family of rational sets of $\omega$-words is closed under complementation.
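
As a minimal sketch of Proposition 4, assume an automaton is stored as a plain record of its five components. The `Automaton` type, the `complement` function and the two-state `example` are hypothetical names and data introduced only for illustration; the code does not verify that its input is a CUBA, which the proposition requires.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Automaton:
    """A Buchi automaton (Q, A, E, I, F); assumed by the caller to be a CUBA."""
    states: frozenset
    alphabet: frozenset
    transitions: frozenset  # triples (p, a, q) meaning p --a--> q
    initial: frozenset
    final: frozenset


def complement(aut: Automaton) -> Automaton:
    # Proposition 4: keep Q, A, E and F, and replace I by Q \ I.
    # This is only meaningful if `aut` is unambiguous and complete;
    # nothing here checks that assumption.
    return Automaton(aut.states, aut.alphabet, aut.transitions,
                     aut.states - aut.initial, aut.final)


# Illustrative data only (not claimed to be a CUBA).
example = Automaton(
    states=frozenset({0, 1}),
    alphabet=frozenset({"a", "b"}),
    transitions=frozenset({(0, "a", 0), (1, "a", 1), (0, "b", 1), (1, "b", 0)}),
    initial=frozenset({0}),
    final=frozenset({0}),
)
```

Applying `complement` twice returns the original record, which mirrors the fact that only the initial states are exchanged.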

5.2 Union and Intersection 

In this section, we show how CUBA recognizing the union $X_1 \cup X_2$ and the
intersection $X_1 \cap X_2$ can be obtained from CUBA recognizing $X_1$ and $X_2$.

We suppose that the sets $X_1$ and $X_2$ are respectively recognized by the CUBA
$\mathcal{A}_1 = (Q_1, A, E_1, I_1, F_1)$ and $\mathcal{A}_2 = (Q_2, A, E_2, I_2, F_2)$. We will construct two
CUBA $\mathcal{U} = (Q, A, E, I_\mathcal{U}, F)$ and $\mathcal{I} = (Q, A, E, I_\mathcal{I}, F)$ respectively recognizing
the union $X_1 \cup X_2$ and the intersection $X_1 \cap X_2$. Both automata $\mathcal{U}$ and $\mathcal{I}$ share
the same set $Q$ of states, the same set $E$ of transitions and the same set $F$ of final
states.

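We first describe the states and the transitions of both automata $\mathcal{U}$ and $\mathcal{I}$.
These automata are based on the product of the automata $\mathcal{A}_1$ and $\mathcal{A}_2$ but a
third component is added. The final states may not appear at the same time in
$\mathcal{A}_1$ and $\mathcal{A}_2$. The third component synchronizes the two automata by indicating
in which of the two automata the first final state occurs. The set $Q$ of states is
$Q = Q_1 \times Q_2 \times \{1, 2\}$. Each state is then a triple $(q_1, q_2, \varepsilon)$ where $q_1$ is a state of $\mathcal{A}_1$,
$q_2$ is a state of $\mathcal{A}_2$ and $\varepsilon$ is 1 or 2. There is a transition $(q_1', q_2', \varepsilon') \xrightarrow{a} (q_1, q_2, \varepsilon)$ if
$q_1' \xrightarrow{a} q_1$ and $q_2' \xrightarrow{a} q_2$ are transitions of $\mathcal{A}_1$ and $\mathcal{A}_2$ and if $\varepsilon'$ is defined as follows.

$$\varepsilon' = \begin{cases} 1 & \text{if } q_1 \in F_1 \\ 2 & \text{if } q_1 \notin F_1 \text{ and } q_2 \in F_2 \\ \varepsilon & \text{otherwise} \end{cases}$$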

This definition is not completely symmetric. When both $q_1$ and $q_2$ are final
states, we choose to set $\varepsilon' = 1$. We now define the set $F$ of final states as

$$F = \{(q_1, q_2, \varepsilon) \mid q_2 \in F_2 \text{ and } \varepsilon = 1\}.$$

This definition is also not symmetric.

It may be easily verified that any loop around a final state $(q_1, q_2, \varepsilon)$ also
contains a state $(q_1', q_2', \varepsilon')$ such that $q_1' \in F_1$. This implies that the function
which maps a path $\gamma$ to the pair $(\gamma_1, \gamma_2)$ of paths in $\mathcal{A}_1$ and $\mathcal{A}_2$ is one to one
from the set of final paths in $\mathcal{U}$ or $\mathcal{I}$ to the set of pairs of final paths in $\mathcal{A}_1$
and $\mathcal{A}_2$. Thus if both $\mathcal{A}_1$ and $\mathcal{A}_2$ are unambiguous and complete, then both
automata $\mathcal{U}$ and $\mathcal{I}$ are also unambiguous and complete.

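If $q_1$ and $q_2$ are the respective starting states of $\gamma_1$ and $\gamma_2$, the starting state
of $\gamma$ is then equal to $(q_1, q_2, \varepsilon)$ with $\varepsilon \in \{1, 2\}$. We thus define the sets $I_\mathcal{U}$ and $I_\mathcal{I}$
of initial states of the automata $\mathcal{U}$ and $\mathcal{I}$ as follows.

$$I_\mathcal{U} = \{(q_1, q_2, \varepsilon) \mid (q_1 \in I_1 \text{ or } q_2 \in I_2) \text{ and } \varepsilon \in \{1, 2\}\}$$
$$I_\mathcal{I} = \{(q_1, q_2, \varepsilon) \mid q_1 \in I_1 \text{ and } q_2 \in I_2 \text{ and } \varepsilon \in \{1, 2\}\}$$

From these definitions, it is clear that both automata $\mathcal{U}$ and $\mathcal{I}$ are unambiguous
and complete and that they respectively recognize $X_1 \cup X_2$ and $X_1 \cap X_2$.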
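
The construction of this section can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: automata are represented as dictionaries with the hypothetical keys `states`, `transitions`, `initial` and `final`, transitions are triples `(p, a, q)`, and the function name `product_cuba` is invented here. The third component of the source state is computed from the target state, as in the definition of the transitions above.

```python
from itertools import product


def product_cuba(aut1, aut2):
    """Return (Q, E, F, I_union, I_intersection) for the product with a third component."""
    Q1, E1, I1, F1 = (aut1[k] for k in ("states", "transitions", "initial", "final"))
    Q2, E2, I2, F2 = (aut2[k] for k in ("states", "transitions", "initial", "final"))

    states = set(product(Q1, Q2, (1, 2)))

    transitions = set()
    for (p1, a, q1) in E1:
        for (p2, b, q2) in E2:
            if a != b:
                continue  # both components must read the same letter
            for eps in (1, 2):  # third component of the target state
                if q1 in F1:
                    eps_source = 1
                elif q2 in F2:
                    eps_source = 2
                else:
                    eps_source = eps
                transitions.add(((p1, p2, eps_source), a, (q1, q2, eps)))

    final = {(q1, q2, e) for (q1, q2, e) in states if q2 in F2 and e == 1}
    init_union = {(q1, q2, e) for (q1, q2, e) in states if q1 in I1 or q2 in I2}
    init_inter = {(q1, q2, e) for (q1, q2, e) in states if q1 in I1 and q2 in I2}
    return states, transitions, final, init_union, init_inter


# Test data only: a one-state automaton over {a, b} in which the state is initial and final.
one_state = {
    "states": {"q"},
    "transitions": {("q", "a", "q"), ("q", "b", "q")},
    "initial": {"q"},
    "final": {"q"},
}
Q, E, F, I_union, I_inter = product_cuba(one_state, one_state)
```

For this test data the product has two states and four transitions, and every transition starts in a state whose third component is 1, because the target's first component is always final.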

Abstract. It is well-known that for classical one-dimensional one-way
CA (OCA) it is possible to speed up language recognition times from
$(1 + r)n$, $r \in \mathbb{R}_+$, to $(1 + r/2)n$. In this paper we show that this no
longer holds for OCA in which a cell can communicate only one bit (or
more generally a fixed amount) of information to its neighbor in each
step. For arbitrary real numbers $r_2 > r_1 > 1$, in time $r_2 n$ 1-bit OCA can
recognize strictly more languages than those operating in time $r_1 n$. Thus
recognition times may increase by an arbitrarily large constant factor
when restricting the communication to 1 bit. For two-way CA there is
also an infinite hierarchy but it is not known whether it is as dense as
for OCA. Furthermore it is shown that for communication-restricted
CA two-way flow of information can be much more powerful than an
arbitrary number of additional communication bits.



1 Introduction 

The model of 1-bit CA results from the standard definition by restricting the 
amount of information which can be transmitted by a cell to its neighbors in 
one step to be only 1 bit. We call this the communication bandwidth. 

Probably the first paper investigating 1-bit CA is the technical report [2],
where it is shown that even with this model solutions of the FSSP in optimal time
are possible. More recently [4] has described 1-bit CA for several one- and
two-dimensional problems (e.g. generation of Fibonacci sequences and determining
whether two-dimensional patterns are connected) which again run in the
minimum time possible. Therefore the question immediately arises about the
consequences of the restriction to 1-bit information flow in the general case.

In Section 2 basic definitions are given and it is proved that each CA with $s$
states can be simulated by a 1-bit CA with a slowdown by a factor of at most
$\lceil \log s \rceil$. This seems to be some kind of folklore, but we include the proof for the
sake of completeness and reference in later sections.

In Section 3 it is shown that for one-way CA (OCA) in general there must be 
a slowdown. More specifically there is a very fine hierarchy with an uncountable 
number of distinct levels (order isomorphic to the real numbers greater than 1) 
within the class of languages which can be recognized by 1-bit OCA in linear 
time. 

In Section 4 we consider two-way CA with restricted communication. 

The results obtained are in contrast to those for cellular devices with
unrestricted information flow. For example, general speedup theorems have been
shown for iterative arrays [1] and cellular automata [3].

2 Simulation of k-bit CA by 1-bit CA 

A deterministic CA is determined by a finite set of states $Q$, a neighborhood
$N' = N \cup \{0\}$ and a local rule. (For a simpler notation below, we assume that
$N = \{n_1, \ldots, n_{|N|}\}$ does not contain 0.) The local rule of $C$ is of the form
$\tau : Q \times Q^N \to Q$, i.e. each cell has the full information on the states of its
neighbors.

In a $k$-bit CA $C$, each cell only gets $k$ bits of information about the state
of each neighbor. To this end there are functions $b_i : Q \to \mathbb{B}^k$ specified, where
$\mathbb{B} = \{0, 1\}$. If a cell is in state $q$ then $b_i(q)$ are the bits observed by neighbor $n_i$.
We allow different bits to be seen by different neighbors. The local transformation
of $C$ is of the form $\tau : Q \times (\mathbb{B}^k)^N \to Q$.

Given a configuration $c : \mathbb{Z} \to Q$ and its successor configuration $c'$, the new
state of a cell $i$ is $c'_i = \tau(c_i, b_1(c_{i+n_1}), \ldots, b_{|N|}(c_{i+n_{|N|}}))$.

As usual, for the recognition of formal languages over an input alphabet $A$
one chooses $Q \supseteq A$ and a set of accepting final states $F \subseteq Q \setminus A$. In the initial
configuration for an input $x_1 \cdots x_n \in A^n$, cell $i$ is in state $x_i$ for $1 \le i \le n$ and all
other cells are in a quiescent state $q$ (satisfying $\tau(q, q^N) = q$). A configuration $c$
is accepting iff $c_1 \in F$.

Given a $k$-bit CA $C$ one can construct a 1-bit CA $C'$ with the same
neighborhood simulating $C$ in the following sense: Each configuration $c$ of $C$ is also
a legal configuration of $C'$, and there is a constant $l$ (independent of $c$) such
that if $c'$ is $C$'s successor configuration of $c$ then $C'$ when starting in $c$ reaches
$c'$ after $l$ steps. The basic idea is to choose representations of states by binary
words and to transmit them bit by bit to the neighbors before doing a “real”
state transition.

Let $\mathbb{B}^{\le i}$ denote $\mathbb{B}^0 \cup \cdots \cup \mathbb{B}^i$. Denote by $b_{i,j}(q)$ the $j$-th bit of $b_i(q)$, i.e.
$b_i(q) = b_{i,k}(q) \cdots b_{i,1}(q)$.

Algorithm 1. As the set of states of $C'$ choose $Q' = Q \times (\mathbb{B}^{\le k-1})^N$; i.e. each
state $q'$ consists of a $q \in Q$ and binary words $v_1, \ldots, v_{|N|}$ of identical length $j$
for some $0 \le j \le k-1$. For each $q \in Q$ identify $(q, \varepsilon, \ldots, \varepsilon)$ with $q$ so that $Q$ can
be considered a subset of $Q'$. (Here, $\varepsilon$ is the empty word.) For $j \le k-1$ and a
$q' = (q, v_1, \ldots, v_{|N|}) \in Q \times (\mathbb{B}^j)^N$ define $b'_i(q') = b_{i,j+1}(q)$, where the $b'_i$ are the
functions describing the bit seen by neighbor $n_i$ in $C'$.

The local transformation $\tau'$ of $C'$ is defined as follows:

— If the length $j$ of all $v_i$ is $< k-1$ then
  $\tau'((q, v_1, \ldots, v_{|N|}), x_1, \ldots, x_{|N|}) = (q, x_1 v_1, \ldots, x_{|N|} v_{|N|})$.

— If the length $j$ of all $v_i$ is $= k-1$ then
  $\tau'((q, v_1, \ldots, v_{|N|}), x_1, \ldots, x_{|N|}) = (\tau(q, x_1 v_1, \ldots, x_{|N|} v_{|N|}), \varepsilon, \ldots, \varepsilon)$.

The above construction shows that the following lemma holds:

Lemma 2. A $k$-bit CA can be simulated by a 1-bit CA with the same neighborhood
and slowdown $k$.
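
The construction of Algorithm 1 can be checked on small instances with the following sketch. Several things here are assumptions made only for the illustration: configurations are finite and cyclic (the paper uses configurations over all integers), the functions $b_i$ return tuples of $k$ bits written as $b_{i,k} \cdots b_{i,1}$, and the names `step_kbit`, `step_1bit` and `simulate_one_kbit_step` as well as the example rule are hypothetical.

```python
def step_kbit(config, offsets, bits, tau):
    """One step of a k-bit CA; cell i observes bits[j](state of cell i + offsets[j])."""
    n = len(config)
    return [
        tau(config[i], tuple(bits[j](config[(i + d) % n]) for j, d in enumerate(offsets)))
        for i in range(n)
    ]


def step_1bit(config, offsets, bits, tau, k):
    """One step of the simulating 1-bit CA; states are pairs (q, words)."""
    n = len(config)

    def visible_bit(state, j):
        q, words = state
        length = len(words[0])  # all collected words have the same length
        # Bit number length+1, counted from the right end of b_j(q).
        return bits[j](q)[k - 1 - length]

    new_config = []
    for i in range(n):
        q, words = config[i]
        received = [visible_bit(config[(i + d) % n], j) for j, d in enumerate(offsets)]
        grown = tuple((x,) + v for x, v in zip(received, words))  # x_j v_j
        if len(words[0]) < k - 1:
            new_config.append((q, grown))  # keep collecting bits
        else:
            # All k bits are known: perform the real transition, reset the words.
            new_config.append((tau(q, grown), tuple(() for _ in offsets)))
    return new_config


def simulate_one_kbit_step(config, offsets, bits, tau, k):
    """Embed a configuration, run k steps of the 1-bit CA, project back to Q."""
    embedded = [(q, tuple(() for _ in offsets)) for q in config]
    for _ in range(k):
        embedded = step_1bit(embedded, offsets, bits, tau, k)
    return [q for q, _ in embedded]


# Hypothetical example: 4 states, 2 bits per neighbor, neighbors at offsets -1 and +1.
K = 2
OFFSETS = [-1, 1]
BITS = [lambda q: ((q >> 1) & 1, q & 1)] * 2
TAU = lambda q, obs: (q + 2 * obs[0][0] + obs[0][1] + 3 * (2 * obs[1][0] + obs[1][1])) % 4
START = [0, 1, 2, 3, 1]
```

With these definitions, `simulate_one_kbit_step(START, OFFSETS, BITS, TAU, K)` and `step_kbit(START, OFFSETS, BITS, TAU)` yield the same configuration, which is the slowdown-$k$ behavior stated in Lemma 2 for this instance.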

Since the states of a set $Q$ can be unambiguously represented as binary words
of length $\lceil \log_2 |Q| \rceil$, it is straightforward to see:

Corollary 3. Each CA with $s$ states can be simulated by a 1-bit CA with
slowdown $\lceil \log_2 s \rceil$ having the same neighborhood and identical functions $b_i$ for all
neighbors.
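
The binary representation used for Corollary 3 can be illustrated with a short sketch. It assumes that the $s$ states are numbered $0, \ldots, s-1$; the function names `bits_needed` and `encode_state` are hypothetical.

```python
def bits_needed(s: int) -> int:
    """ceil(log2(s)) for s >= 1, computed with integer arithmetic only."""
    return (s - 1).bit_length()


def encode_state(index: int, s: int) -> str:
    """Binary word of length ceil(log2(s)) for the state numbered index (0 <= index < s)."""
    width = bits_needed(s)
    return format(index, "b").zfill(width) if width else ""
```

For example, a CA with 5 states needs words of length 3, so the simulation of the corollary would take 3 steps of the 1-bit CA per original step.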

It should be observed that the above slowdown happens if the bit visible to other 
cells is the same for all neighbors. One could wonder whether the slowdown is 
always less if different bits are sent to different neighbors. However this is not 
the case. The proofs below for the lower bounds do not specifically make any 
use of the fact that all neighbors are observing the same bit; they work even if 
there were $|N|$ (possibly different) functions $b_i$ for the neighboring cells.

On the other hand one should note that for certain CA there is a possibility 
for improvement, i.e. conversion to 1-bit CA with a smaller slowdown: Some- 
times it is already known that neighbors do not need the full information about 
each state. In a typical case the set of states might be the Cartesian product 
of some sets and a neighbor only needs to know one component, as it is by 
definition the case in so-called partitioned CA. It is then possible to apply a 
similar construction as above, but only to that component. Since the latter can 
be described with fewer bits than the whole state, the construction results in a
smaller slowdown. 

We will make use of this and a related trick in Section 3.3. 

3 A Linear-Time Hierarchy for 1-bit OCA 

For a function $f : \mathbb{N}_+ \to \mathbb{N}_+$ denote by $\mathrm{OCA}_k(f(n))$ the family of languages
which can be recognized by $k$-bit OCA in time $f(n)$. In this section we will prove:

Theorem 4. For all real numbers $1 < r_1 < r_2$ the following holds:

$$\mathrm{OCA}_1(r_1 n) \subsetneq \mathrm{OCA}_1(r_2 n)$$

We will proceed in 3 major steps. 



3.1 An Infinite Hierarchy 

Let $A_m$ be an input alphabet with exactly $m = 2^l - 1$ symbols. Hence $l$ bits are
needed to describe one symbol of $A_m \cup \{\square\}$, where $\square$ is the quiescent state. The
case of alphabets with an arbitrary number of symbols will be considered later.

Denote by $L_m$ the set $\{v v^R \mid v \in A_m^+\}$ of all palindromes of even length over
$A_m$.

Lemma 5. Each 1-bit OCA recognizing $L_m$ needs at least time $(l - \varepsilon)n$ for every
$\varepsilon > 0$.

Proof. Consider a 1-bit OCA $C$ recognizing $L_m$ and an input length $n$. Denote
by $t$ the worst case computation time needed by $C$ for inputs of length $n$.

Consider the boundary between cells $k$ and $k+1$, $1 \le k < n$, which separates a
left and a right part of an input. The computations in the left part are completely
determined by the corresponding part of the input and the sequence $B_r$ of bits
received by cell $k$ from the right during time steps $1, \ldots, t-k$. There are exactly
$2^{t-k}$ bit sequences. On the other hand there are $m^{n-k} = (2^l - 1)^{n-k}$
right parts of inputs of length $n$.

Assume that $2^{t-k} < (2^l - 1)^{n-k}$. Then there would exist two different
words $v_1$ and $v_2$ of length $n-k$ resulting in the same bit string received by cell
$k$ during any computation for an input of one of the forms $v v_1$ or $v v_2$. Since we
are considering OCA, the bit string is independent of any symbols to the left of
cell $k+1$. Therefore $C$ would either accept or reject both inputs $v_1^R v_1$ and $v_1^R v_2$,
although exactly one of them is in $L_m$. Contradiction.

Therefore $2^{t-k} \ge (2^l - 1)^{n-k}$. For sufficiently large $n$ there is an arbitrarily
small $\varepsilon''$ such that this implies $2^{t-k} \ge 2^{(l-\varepsilon'')(n-k)}$, i.e. $t - k \ge (l - \varepsilon'')(n - k)$,
i.e. $t \ge ln + k - lk - \varepsilon'' n + \varepsilon'' k$. For an arbitrarily chosen $\varepsilon' > 0$ consider the case
$k = \varepsilon' n$ (for sufficiently large $n$). One then gets
$t \ge ln + \varepsilon' n - l\varepsilon' n - \varepsilon'' n + \varepsilon'' \varepsilon' n = ln - (\varepsilon'(l - 1 - \varepsilon'') + \varepsilon'')n$.
If $\varepsilon'$ and $\varepsilon''$ are chosen sufficiently small this is larger
than $ln - \varepsilon n$ for a given $\varepsilon$.
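
The counting step of the proof can be checked numerically. The sketch below computes, for given $l$, $n$ and $k$, the smallest $t$ that satisfies $2^{t-k} \ge (2^l - 1)^{n-k}$; the function name `min_steps_for_cut` is hypothetical and the code illustrates only this one inequality, not the whole lower bound argument.

```python
def min_steps_for_cut(l: int, n: int, k: int) -> int:
    """Smallest t with 2**(t - k) >= (2**l - 1)**(n - k), for 1 <= k < n."""
    right_parts = (2**l - 1) ** (n - k)  # number of right parts of length n - k
    # (x - 1).bit_length() is the smallest e with 2**e >= x, for x >= 1.
    return k + (right_parts - 1).bit_length()
```

For $l = 2$, $n = 10$ and $k = 2$ there are $3^8 = 6561$ right parts, so at least 13 bits have to cross the boundary and the function returns 15.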

Lemma 6. $L_m$ can be recognized by 1-bit OCA in time $(l+1)n + O(1)$.

Proof. Algorithm 7 below describes an $(l+1)$-bit OCA recognizing the language
in time $n + O(1)$. Hence the claim follows from Lemma 2.

Algorithm 7. We describe an $(l+1)$-bit OCA recognizing $L_m$. The set of states
can be chosen to be of the form $Q_m = A_m \cup A_m \times (A_m \cup \{\square\}) \times \mathbb{B}^2$. The local
rule mapping the state $q_c$ of a cell and the state $q_r$ of its neighbor to $\tau(q_c, q_r)$ is
chosen as follows. For the first step, with $a, b' \in A_m$:

$$\tau(a, b') = (a, b', 1, 1)$$

For later steps:

$$\tau((a, a', x, x'), (b, b', y, y')) = (a, b', x' \wedge [a = a'], y)$$

where $[a = a']$ is 1 if the symbols are equal and 0 otherwise. As can be seen
immediately, the only information needed from the right neighbor is one symbol
$b'$ and one bit $y$. Hence an $(l+1)$-bit OCA can do the job.
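
A direct transcription of the two cases of the local rule may help when reading it. The sketch reads the rule for later steps as $\tau((a, a', x, x'), (b, b', y, y')) = (a, b', x' \wedge [a = a'], y)$; the function names are hypothetical, and the behavior at the quiescent right border, which the text does not spell out, is deliberately left out.

```python
def tau_first(a, b_prime):
    """First step: a cell with input a sees the symbol b' of its right neighbor."""
    return (a, b_prime, 1, 1)


def tau_later(cell, right):
    """Later steps: only b' and y of the right neighbor's state are used."""
    a, a_prime, x, x_prime = cell
    b, b_prime, y, y_prime = right
    return (a, b_prime, x_prime & int(a == a_prime), y)
```

The unused variables `b` and `y_prime` make visible why one symbol and one bit from the right neighbor are sufficient.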

A closer look at the local rule reveals that the OCA above indeed recognizes
palindromes in time $n + O(1)$ if one chooses as the set of final states
$A_m \times \{\square\} \times \{1\} \times \mathbb{B}$ (see [5] for details). Hence $L_m$ can also be recognized by 1-bit OCA in
time $(l+1)n + O(1)$.

The upper bound of the previous lemma is not very close to the lower bound of 
Lemma 5, and it is not obvious how to improve at least one of them. 

3.2 Reducing the Gap to $(l \pm \varepsilon)n$

We will now define variants of the palindrome language for which gaps between 
upper and lower bound can be proved to be very small. 

We will use vectors of length $r$ of symbols from an alphabet $A$ as symbols
of a new alphabet $A'$. Although a vector of symbols is more or less the same
as a word of symbols, we will use different notations for both concepts in order
to make the construction a little bit clearer. Denote by $M^{(r)}$ the set of vectors
of length $r$ of elements from a set $M$ and by $A^r$ the set of words of length $r$
consisting of symbols from $A$. The obvious mapping $(x_1, \ldots, x_r) \mapsto x_1 \cdots x_r$
induces a monoid homomorphism $h : (A^{(r)})^* \to (A^r)^* \subseteq A^*$.

Definition 8. For integers $m \ge 1$ and $r \ge 1$ let

$$L_{m,r} = \{v\, h(v)^R \mid v \in (A_m^{(r)})^+\}.$$

$L_{m,r}$ is a language over the alphabet $A_m^{(r)} \cup A_m$. The words in $L_{m,r}$ are still more
or less palindromes where in the left part of a word groups of $r$ elements from
$A_m$ are considered as one (vector) symbol. As a special case one has $L_{m,1} = L_m$
as defined earlier.

Lemma 9. For each $\varepsilon > 0$ there is an $r \ge 1$ such that each 1-bit OCA
recognizing $L_{m,r}$ needs at least time $(l - \varepsilon)n$.

A proof can be given analogously to the proof of Lemma 5 above. One only has
to observe that the border between cells $k$ and $k+1$ must not lie within “the left
part $v$” of an input. Therefore for small $\varepsilon$ one must choose a sufficiently large $r$,
e.g. $r > 1/\varepsilon$, to make sure that $|v| < \varepsilon |v\, h(v)^R|$.

Thus for sufficiently large $r$, although $\lceil \log_2 |A_m| \rceil \cdot n$ is not a lower bound on
the recognition time of $L_{m,r}$ by 1-bit OCA, it is “almost”.

Lemma 10. For each $\varepsilon > 0$ and $r = 1/\varepsilon$ the language $L_{m,r}$ can be recognized
by a 1-bit OCA in time $(l + \varepsilon)n + O(1)$.

Thus for sufficiently large $r$, although $\lceil \log_2 |A_m| \rceil \cdot n$ is not an upper bound on
the achievable recognition time on 1-bit OCA, it is “almost”.

For the proof we use a construction similar to Algorithm 7. 

Algorithm 11. The CA uses a few additional steps before and after the check
for palindromes, where the check itself also has to be adapted to the different
form of inputs.

— In the first step each cell sends one bit to its left neighbor indicating whether
  its input symbol is from $A_m$ or $A_m^{(r)}$. Thus, if the input is not in $(A_m^{(r)})^* A_m^*$
  this is detected by at least one cell and an error indicator is stored locally.
  It will be used later.

— One may therefore assume now that the input is of the indicated form, and
  we will call cells with an input symbol from $A_m$ the “right” cells and those
  with a symbol from $A_m^{(r)}$ the “left” cells.

  After the first step the rightmost of the left cells has identified itself.

— With the second step an algorithm for palindrome checking is started. The
  modifications with respect to Algorithm 7 are as follows:

  — Each cell is counting modulo $lr + 1$ in each step. (This doesn’t require
    any communication.)

  — During the first $lr$ steps of a cycle the right cells are shifting $r$ symbols
    to the left. In the $(lr + 1)$-st step they do not do anything.

  — During the first $lr$ steps of a cycle the left cells are also shifting $r$ symbols
    to the left. In addition they are accumulating what they receive in
    registers. In step $lr$ they are comparing whether the register contents
    “match” their own input symbol, and in step $lr + 1$ they are sending the
    result of the comparison, combined with the previously received
    comparison bit, to their left neighbor.

  One should observe that the last point is the basic trick: the comparison bit
  does not have to be transported one cell to the left each time a symbol has been
  received, but only every $r$ symbols. Thus by increasing $r$ the fraction of time
  needed for transmitting these bits can be made arbitrarily small.

— All the algorithms previously described have the following property: The part
  of the time space diagram containing all information needed for
  the decision whether to accept or reject an input has the form of a triangle.
  Its longest line is a diagonal with some slope $n/t(n)$ (or $t(n)/n$ depending on
  how you look at it) leading from the rightmost input cell to the leftmost one.
  Furthermore every cell can know when it has done its job because afterwards
  it only receives the encodings of the quiescent state.

— Therefore the following signal can be implemented easily: It starts at the
  rightmost input cell and collects the results of the checks done in the very
  first step. It is moved to the left immediately after a cell has transmitted
  at least one (encoding of the) quiescent state in an $(lr + 1)$-cycle. Thus this
  signal adds only one additional step to the overall recognition time.

Since the above algorithm needs $lr + 1$ steps per $r$ input symbols from $A_m$ and
since the rightmost $r$ symbols have to travel approximately $n$ cells far, the total
running time is $n \cdot (lr + 1)/r + O(1)$, i.e. $(l + 1/r)n + O(1)$ as required.
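
The arithmetic behind the running time can be reproduced with exact fractions. In this sketch the function name `cycle_parameters` is hypothetical, and $r$ is taken as the smallest integer with $r \ge 1/\varepsilon$, which is an assumption made so that $r$ is an integer for every rational $\varepsilon$.

```python
from fractions import Fraction
import math


def cycle_parameters(l: int, eps: Fraction):
    """Return (r, coefficient) with r >= 1/eps and coefficient = (l*r + 1)/r = l + 1/r."""
    r = math.ceil(1 / eps)
    coefficient = Fraction(l * r + 1, r)  # l*r + 1 steps for every r input symbols
    return r, coefficient
```

For $l = 3$ and $\varepsilon = 1/10$ this gives $r = 10$ and a time coefficient of $31/10$, i.e. $(l + 1/r)n$ stays within $(l + \varepsilon)n$.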

From Lemmata 9 and 10 one can immediately deduce the following:

Corollary 12. For each integer constant $c$ the set of languages which can be
recognized by 1-bit OCA in time $cn$ is strictly included in the set of languages
which can be recognized by 1-bit OCA in time $(c+2)n$.

This has to be contrasted with unlimited OCA where there is no such infinite
hierarchy within the family of languages which can be recognized in linear time.
One therefore gets the situation depicted in Figure 1.

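In the top row one uses the fact that for each $i \ge 1$

$$\mathrm{OCA}_1(2in) \subseteq \mathrm{OCA}_1((2i+1-\varepsilon)n) \subsetneq \mathrm{OCA}_1((2i+1+\varepsilon)n) \subseteq \mathrm{OCA}_1((2i+2)n)$$

and for each column one has to observe that

$$\mathrm{OCA}_1(2in) \subsetneq \mathrm{OCA}_1((2i+2)n) \subseteq \mathrm{OCA}((2i+2)n) = \mathrm{OCA}(2in).$$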

$$\mathrm{OCA}_1(2n) \subsetneq \mathrm{OCA}_1(4n) \subsetneq \mathrm{OCA}_1(6n) \subsetneq \mathrm{OCA}_1(8n) \subsetneq \cdots$$
$$\mathrm{OCA}(2n) = \mathrm{OCA}(4n) = \mathrm{OCA}(6n) = \mathrm{OCA}(8n) = \cdots$$

Fig. 1. A hierarchy for 1-bit OCA. 



3.3 There Are Small Gaps Everywhere 

Finally we will prove now that a small increase of the linear-time complexity
already leads to an increased recognition power not only around $(r \pm \varepsilon)n$ for
natural numbers $r$, but for all real numbers $r > 1$. Since the rational numbers
are dense in $\mathbb{R}$ it suffices to prove the result for $r \in \mathbb{Q}$.

The basic idea is the following: The number $l$ playing an important role in
the previous sections is something like an “average number of bits needed per
symbol”. What we want to achieve below is an average number $r$ of bits needed
per symbol.

Assume that an arbitrary rational number $r > 1$ has been fixed as well as
the relatively prime natural numbers $x$ and $y < x$ such that $r = x/y$. Then
the above is more or less equivalent to saying that one needs $x$ bits for every $y$
symbols.

Therefore choose the smallest $m$ such that $2^x \le m^y$ and a set $M$ of $2^x - 1$
different words from $A_m^y$. Then extend the alphabet and “mark” the first and
last symbols of these words. These markings will only be used in Algorithm 15.
For the sake of simplicity we will ignore them in the following descriptions. In order
to define the languages $L'_{x,y,m,r}$ to be used later we start with the languages $L_{m',r}$
considered in the previous section, where $m' = 2^x - 1$. Denote by $g_{x,y}$ a
one-to-one mapping $g_{x,y} : A_{m'} \to M$ which is extended to vectors of length $r$ by
considering it as a function mapping each $r$-tuple of symbols from $A_{m'}$ to a word
of length $y$ of $r$-tuples of symbols from $A_m$, and extending this further to a
monoid homomorphism in the obvious way. Now choose

$$L'_{x,y,m,r} = g_{x,y}(L_{m',r}).$$

Lemma 13. For each $\varepsilon > 0$ there is an $r \ge 1$ such that each 1-bit OCA
recognizing $L'_{x,y,m,r}$ needs at least time $(x/y - \varepsilon)n$.

It is a routine exercise to adapt the proof of Lemma 9 to the new situation.

Lemma 14. For each $\varepsilon > 0$ and $r = 1/\varepsilon$ the language $L'_{x,y,m,r}$ can be recognized
by a 1-bit OCA in time $(x/y + \varepsilon)n + O(1)$.

Algorithm 15. Basically the same idea as in Algorithm 11 can be used. Two 
modifications are necessary. 

The first one is a constant number of steps which have to be carried out in
the very beginning. During these steps each cell collects the information about
the $y - 1$ input symbols to its right, so that it knows which of the words $w \in M$
is to its right, assuming that it is the left end of one of them. From then on these
marked cells play the role of all cells from Algorithm 11.

The second modification is an additional signal of appropriate speed which 
is sent from the right end of the input word. It checks that all the left end and 
right end of word markings are indeed distributed equidistantly over the whole 
input word. If this is not the case the input is rejected. 

As a consequence there is an uncountable set of families of languages ordered 
by proper inclusion which is order isomorphic to the real numbers greater than 
1 as already claimed at the beginning of this section: 

Proof (of Theorem 4). Choose a rational number $x/y$ and an $\varepsilon > 0$ such that
$r_1 < x/y - \varepsilon < x/y + \varepsilon < r_2$. From Lemmata 13 and 14 it follows that there is a
language in $\mathrm{OCA}_1((x/y + \varepsilon)n) \setminus \mathrm{OCA}_1((x/y - \varepsilon)n)$, which is then also a witness for
the properness of the above inclusion.
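
The choice of $x/y$ and $\varepsilon$ in this proof can be made explicit. The sketch below is one possible way to pick them and is not taken from the paper: it uses exact rationals as stand-ins for the real numbers $r_1 < r_2$, and the function name `separating_rational` is hypothetical.

```python
from fractions import Fraction


def separating_rational(r1: Fraction, r2: Fraction):
    """Return (x/y, eps) with r1 < x/y - eps < x/y + eps < r2, assuming 1 < r1 < r2."""
    d = r2 - r1
    y = int(4 / d) + 1            # guarantees 1/y < d/4
    x = round((r1 + r2) / 2 * y)  # x/y lies within 1/(2y) of the midpoint
    return Fraction(x, y), d / 4
```

Since $x/y$ differs from the midpoint by less than $d/8$ and $\varepsilon = d/4$, both $x/y - \varepsilon$ and $x/y + \varepsilon$ stay strictly inside the interval between $r_1$ and $r_2$. `Fraction` reduces $x/y$ to lowest terms, matching the requirement that $x$ and $y$ be relatively prime.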

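Therefore the hierarchy depicted in Figure 1 can be generalized to the following,
where $r_1$ and $r_2$ are arbitrary real numbers satisfying $1 < r_1 < r_2$:

$$\cdots \subsetneq \mathrm{OCA}_1(r_1 n) \subsetneq \cdots \subsetneq \mathrm{OCA}_1(r_2 n) \subsetneq \cdots$$

(each class in the top row is included in the class directly below it)

$$\cdots = \mathrm{OCA}(r_1 n) = \cdots = \mathrm{OCA}(r_2 n) = \cdots$$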



Fig. 2. The very fine hierarchy for 1-bit OCA. 



4 Two-Way CA 

For two-way CA (CA for short) with 1-bit communications one has the following 
result: 

Lemma 16. Each 1-bit CA recognizing $L_m$ needs at least time $(1 + l/2)n/2$.

Proof. Consider a 1-bit CA $C$ recognizing $L_m$ and an input length $n = 2k$.
Denote by $t$ the worst case computation time needed by $C$ for inputs of length
$n$.

Consider the boundary between cells $k = n/2$ and $k+1$, which separates
the two halves of an input. The computations in the left half are completely
determined by the sequence $B_r$ of bits received by cell $k$ during the time steps
$1, \ldots, t - n/2$ from the right, and the computations in the right half are
completely determined by the sequence $B_l$ of bits received by cell $k+1$ during the
time steps $1, \ldots, t - n/2$ from the left. There are exactly $2^{t-k}$ bit sequences $B_l$
and $2^{t-k}$ bit sequences $B_r$. On the other hand there are $m^k \approx 2^{lk}$ left resp. right
halves of inputs of length $n$.

Assume that $2^{2(t-k)} < 2^{lk}$. Then there would exist two different words $v_1$
and $v_2$ of length $k$ resulting in the same bit strings $B_l$ and $B_r$ for the inputs
$v_1 v_1^R$ and $v_2 v_2^R$. Therefore, in cells $1, \ldots, k$ the computation of $C$ for the input
$v_1 v_2^R$ would be the same as for $v_1 v_1^R$ and since $C$ has to accept $v_1 v_1^R$, it would also
accept $v_1 v_2^R$. Contradiction.

Therefore $2^{2(t-k)} \ge 2^{lk}$, i.e. $t - k \ge \frac{l}{2} \cdot k$, i.e. $t \ge (1 + l/2)n/2$.

On the other hand it is not difficult to construct a CA which shows: 

Lemma 17. $L_m$ can be recognized by 1-bit CA in time $(l + 2)n/2$.

The straightforward construction of shifting the input symbols in both directions, 
accumulating comparison results everywhere and using the result of the middle 
cell suffices. 

Lemmata 16 and 17 immediately give rise to an infinite hierarchy of com- 
plexity classes, but the gaps are large. For example one has 

Corollary 18.

$$\mathrm{CA}_1(n) \subsetneq \mathrm{CA}_1(3n) \subsetneq \mathrm{CA}_1(3^2 n) \subsetneq \mathrm{CA}_1(3^3 n) \subsetneq \cdots$$

In fact the constants can be improved somewhat (using the lemmata above
ultimately to $c^j$ for any constant $c > 2$ if $j$ is large enough). On the other hand
it is unfortunately not clear at all how to obtain results which are as sharp as for OCA.

Finally we point to the following relation between communication-bounded
OCA and CA. As mentioned above $L_m$ can be recognized by 1-bit CA in time
$(l + 2)n/2$, but on the other hand it cannot be recognized by 1-bit OCA in time
$(l - \varepsilon)n$. This is a gap of $(l - \varepsilon)n - (l + 2)n/2 = (l - 2 - 2\varepsilon)n/2$ which can be
made arbitrarily large. In other words:

Lemma 19. For each constant $k > 1$ there are languages for which 1-bit CA
can be faster than any 1-bit OCA recognizing them by a factor of $k$.

Corollary 20. For no constants $r > 1$ and $k > 1$ is $\mathrm{CA}_1(rn) \subseteq \mathrm{OCA}_1(krn)$.
For no constants $r > 1$ and $k > 1$ is $\mathrm{CA}_1(rn) \subseteq \mathrm{OCA}_k(rn)$.

Thus in a sense sometimes the ability to communicate in both directions is more
powerful than any bandwidth for communication in only one direction.



5 Conclusion and Outlook 

It has been shown that for all real numbers $r > 1$ and $\varepsilon > 0$ there are problems
which can be solved on 1-bit OCA in time $(r + \varepsilon)n$, but not in time $rn$. As a
consequence there are problems the solution of which on 1-bit OCA must be
slower than on unlimited OCA by a factor of at least $r$.

It is therefore interesting, and in some way surprising, that certain problems
which are considered to be nontrivial, e.g. the FSSP, can be solved on 1-bit CA
without any loss of time.

Two-way CA with the ability to communicate 1 bit of information in each
direction are more powerful than one-way CA with the ability to communicate
$k$ bits in one direction. For certain formal languages the latter have to be slower
by a constant factor which cannot be bounded.

Our current research on communication-restricted CA is mainly concerned
with two problem fields. One is the improvement of the results for two-way CA. In
particular we suspect that the lower bound given in Lemma 16 can be improved.
The other is an extension of the definitions to CA with an “average bandwidth”
of $z$ bits, where $z > 1$ is allowed to be a rational number. We conjecture that
for OCA there is also a dense hierarchy with respect to the bandwidth (while
keeping the time fixed). This is true if one restricts oneself to integers. For
rational numbers there are some additional technical difficulties due to the not
completely straightforward definition of $z$-bit CA.

Abstract. This work studies the notion of locality in the context of
process specification. It relates naturally with other works where
information about the localities of a program is obtained from
its description written down in a programming language.

This paper presents a new approach for this problem. In our case, the 
information about the system will be given in semantic terms using asyn- 
chronous transition systems. Given an asynchronous transition system we 
build an algebra of localities whose models are possible implementations 
of the known system. We present different results concerning the models 
for the algebra of localities. In addition, our approach neatly considers 
the relation of localities and non-determinism. 



1 Introduction 

In the framework of so-called true concurrency, the idea of causality has
been widely studied [13,12,8,7,15]. Localities, an idea somehow orthogonal to
causality, have also become of interest [1,4,5,10,11,9,3]. Causality states which
events are necessary for the execution of a new one, while localities observe in
which way the events are distributed. Both approaches have been shown not to
be equivalent or to coincide in a very discriminating point [6,17].

The idea of the work on localities is to state where an event occurs given 
the already known structure of a process. Thus, the starting point is a process 
written in a clearly defined syntax. For instance, consider the process 

$$a.c.\mathrm{stop} \parallel_c c.b.\mathrm{stop} \qquad (1)$$

where $\parallel_c$ is the CSP parallel composition: there are two processes running
together, but they must synchronize in the action $c$. This process may execute
actions $a@\bullet|\emptyset$, $b@\emptyset|\bullet$, and $c@\bullet|\bullet$. The term on the right hand side of the @
indicates the places in which the action on the left side of @ occurs. In particular,
the $\bullet$ shows in which side of the parallel operation the action takes place. Notice
that $a$ and $b$ do not share any locality: $a$ occurs at the left hand side of the
parallel composition while $b$ occurs at the right hand side. On the other hand,
the process

$$a.c.b.\mathrm{stop} \qquad (2)$$

presents the same sequence of actions $a$, $c$, and $b$, although in this case they
occur exactly in the same place.

* This work is supported by the CONICYT/BID project 140/94 from Uruguay

Besides, these works on localities have a nagging drawback: in some cases
where non-deterministic choice and parallel composition are involved, localities
for actions do not seem to match our intuition. For instance, in the process

$$(a.\mathrm{stop} \parallel b.\mathrm{stop}) + (c.\mathrm{stop} \parallel d.\mathrm{stop}) \qquad (3)$$

we have that $a@\bullet|\emptyset$ and $d@\emptyset|\bullet$. We could think that $a$ and $d$ do not share any
resource, but in a causal-based model they are clearly in conflict: the occurrence
of one of them forbids the occurrence of the other. From a causal point of view
actions $a$ and $d$ must be sharing some locality.

The approach we chose is to deduce the distribution of events from the se- 
mantics of a given process. We use asynchronous transition systems [14,16] (ATS 
for short) to describe its behavior. Thus, in our case the architecture (i.e., the 
syntax) of the process is not known. 

Our contribution consists of the statement and exploration of this original
semantic-based approach. For each ATS we define an algebra of localities with
a binary operation $\wedge$ that returns the common places of two events, and a
constant 0 meaning “nowhere”. The axioms of this algebra will give the minimal
requirements needed for events to share or not to share some place. The
axiomatization does not specify anything if such a statement cannot be deduced
from the behavior. Thus, given the interpretation of the processes (1) and (2)
we may deduce that $a$ and $c$ must have some common place, and we will write
$a \wedge c \ne 0$. However, the axiomatization is not going to state whether $a \wedge b = 0$
or $a \wedge b \ne 0$. This will depend on the model chosen for the axiomatization, which
gives the definitive criterion for the distribution of events: our models will be
true implementations of ATS. We will show that our approach detects situations
like the one described in process (3). In this case, we will have an explicit axiom
saying that $a$ and $d$ share some common place, i.e., $a \wedge d \ne 0$.

In addition, we discuss different models for the algebra of localities of a given 
ATS. These models may be associated to a program whose specification was 
given in terms of the original ATS. First we introduce the non-independence 
models which consider whether two events are independent in the corresponding 
ATS. Then, we define models which take into account whether two events are 
adjacent. 

Consider two events sharing a locality in a model $\mathcal{M}$ for a given ATS. If they
share some locality in every possible model for this ATS, we call $\mathcal{M}$ a minimal
sharing model. On the other hand, if two events share a locality in $\mathcal{M}$ whenever
they share a locality in some other model, then we call $\mathcal{M}$ a maximal sharing
model. We show that the models concerning adjacency introduced in this work 
hold one of these properties. 

The paper is organized as follows. Section 2 recalls the definition of ATS as
well as some notions of graph theory. Section 3 introduces the algebra of locali- 
ties. Six models for this algebra are presented in Section 4. Finally, conclusions 
and future works are given in Section 5. 

2 Preliminaries 

Asynchronous Transitions Systems Asynchronous transition systems [14,16] 
are a generalization of labeled transition systems. In ATSs, transitions are la- 
beled with events, and each event represents a particular occurrence of an action. 
In addition, ATSs incorporate the idea of independent events. Two independent 
events can be executed in parallel, and so they cannot have resources in common. 
Formally, we define: 

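Definition 1. Let $A = \{\alpha, \beta, \gamma, \ldots\}$ be a set of actions. An asynchronous
transition system is a structure $T = (S, E, I, \to, \ell)$ where

— $S = \{s, t, s', \ldots\}$ is a set of states and $E = \{a, b, c, \ldots\}$ is a set of events;

— $I \subseteq E \times E$ is an irreflexive and symmetric relation of independence. We
write $a\,I\,b$ instead of $(a, b) \in I$;

— ${\to} \subseteq S \times E \times S$ is the transition relation. We write $s \xrightarrow{a} s'$ instead of
$(s, a, s') \in {\to}$;

— $\ell : E \to A$ is the labeling function.

In addition, T has to satisfy the following axioms:

Determinism: $s \xrightarrow{a} s' \wedge s \xrightarrow{a} s'' \implies s' = s''$

Forward stability: $a\,I\,b \wedge s \xrightarrow{a} s' \wedge s \xrightarrow{b} s'' \implies \exists t \in S.\ s' \xrightarrow{b} t \wedge s'' \xrightarrow{a} t$

Commutativity: $a\,I\,b \wedge s \xrightarrow{a} s' \wedge s' \xrightarrow{b} t \implies \exists s'' \in S.\ s \xrightarrow{b} s'' \wedge s'' \xrightarrow{a} t$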

□ 
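
As an illustration of Definition 1, the following is a minimal sketch of how a finite ATS could be encoded and checked against the three axioms. The function names, the integer state numbering and the encoding of process (3) are assumptions made for this example only; the labeling function is omitted because, as in Example 1, events and actions share names.

```python
from itertools import product

def successors(trans, s, a):
    """States reachable from s by one transition carrying event a."""
    return {t for (p, e, t) in trans if p == s and e == a}

def is_ats(states, events, indep, trans):
    """Check that I is irreflexive and symmetric and that the axioms hold."""
    if any(a == b or (b, a) not in indep for (a, b) in indep):
        return False
    # Determinism: at most one a-successor per state.
    if any(len(successors(trans, s, a)) > 1
           for s, a in product(states, events)):
        return False
    for (a, b) in indep:
        for s in states:
            for s1 in successors(trans, s, a):
                # Forward stability: the two branches can be rejoined.
                for s2 in successors(trans, s, b):
                    if not (successors(trans, s1, b) & successors(trans, s2, a)):
                        return False
                # Commutativity: a then b can be reordered as b then a.
                for t in successors(trans, s1, b):
                    if not any(t in successors(trans, s2, a)
                               for s2 in successors(trans, s, b)):
                        return False
    return True

# Hypothetical encoding of (a.stop || b.stop) + (c.stop || d.stop).
STATES = range(7)
EVENTS = {"a", "b", "c", "d"}
INDEP = {("a", "b"), ("b", "a"), ("c", "d"), ("d", "c")}
TRANS = {(0, "a", 1), (0, "b", 2), (1, "b", 3), (2, "a", 3),
         (0, "c", 4), (0, "d", 5), (4, "d", 6), (5, "c", 6)}

print(is_ats(STATES, EVENTS, INDEP, TRANS))
```

Removing the transition `(2, "a", 3)` breaks forward stability for the independent pair a, b, so the check then fails.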

Example 1. In the Introduction we have mentioned a couple of examples. We 
are going to use them as running examples. To simplify notation, we use the 
same name for events and actions. 

We can represent both a.c.b.stop and a.c.stop ||_c c.b.stop by the ATS in
Figure 1. Notice that for the second process, we could have $a\,I\,b$ although that is
not actually relevant. However, it is important to notice that $\neg(a\,I\,c)$ and $\neg(b\,I\,c)$
in both cases.

The ATS for process (a.stop || b.stop) + (c.stop || d.stop) is depicted in
Figure 2. Notice that $a\,I\,b$ and $c\,I\,d$ while any other pair of events is not
independent. Shadowing is used to show the independence relation between events. □



Graphs A graph G consists of a finite set V of vertices together with a set X
of unordered pairs of distinct vertices of V. The elements of X are the edges of
G. We will write $\{v, w\} \in X$ as $vw$. We will write $(V, X)$ for the graph G. Two
vertices v and w are adjacent in G if $vw \in X$. Two edges e and f are adjacent
if $e \cap f \neq \emptyset$.

Fig. 1. The ATS for a.c.b.stop and a.c.stop ||_c c.b.stop

Fig. 2. The ATS for (a.stop || b.stop) + (c.stop || d.stop)




Definition 2 (Subgraphs). We call $H = (V', X')$ a subgraph of $G = (V, X)$,
and write $H \sqsubseteq G$, whenever $V' \subseteq V$ and $X' \subseteq X$. We write $\mathcal{P}G$ for the set of
all subgraphs of G. We write $\mathcal{P}\mathcal{P}G$ for the power set of $\mathcal{P}G$. □

A clique of a graph G is a maximal complete subgraph of G. As a complete
graph is defined by its vertices, we will identify a clique with its corresponding
set of vertices. We write $K(G)$ for the set of cliques of the graph G.

Lemma 1. Let v and w be two vertices of $G = (V, X)$. Then, $vw \in X$ iff there
exists a clique $K \in K(G)$ such that $vw \in X(K)$.
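
To make the notion of clique and the statement of Lemma 1 concrete, here is a small sketch that enumerates the maximal cliques of a graph with the standard Bron–Kerbosch recursion (a technique chosen for this example, not one named in the paper) and checks the lemma on every pair of vertices. All function names are hypothetical.

```python
from itertools import combinations

def maximal_cliques(vertices, edges):
    """Maximal cliques (as vertex sets) of the graph (vertices, edges)."""
    adj = {v: set() for v in vertices}
    for e in edges:
        v, w = tuple(e)
        adj[v].add(w)
        adj[w].add(v)
    found = set()

    def expand(r, p, x):
        # r: current clique, p: candidates, x: already explored vertices.
        if not p and not x:
            found.add(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(vertices), set())
    return found

def lemma1_holds(vertices, edges):
    """vw is an edge iff some clique contains both v and w."""
    ks = maximal_cliques(vertices, edges)
    return all((frozenset((v, w)) in edges)
               == any(v in k and w in k for k in ks)
               for v, w in combinations(sorted(vertices), 2))

V = {"a", "b", "c"}
PATH = {frozenset("ac"), frozenset("bc")}           # a - c - b
TRIANGLE = PATH | {frozenset("ab")}

print(maximal_cliques(V, PATH))
print(maximal_cliques(V, TRIANGLE))
print(lemma1_holds(V, PATH), lemma1_holds(V, TRIANGLE))
```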

3 The Algebra of Localities 

In this section we explain how to obtain an algebra of localities from a given 
ATS. The algebra of localities is constructed over a semilattice by adding some 
particular axioms for each ATS. 

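Definition 3. A semilattice is a structure $(\mathcal{L}, \wedge, 0)$ where $\wedge : \mathcal{L} \times \mathcal{L} \to \mathcal{L}$ and
$0 \in \mathcal{L}$ satisfying the following axioms:

$a \wedge b = b \wedge a$ (commutativity)    $a \wedge (b \wedge c) = (a \wedge b) \wedge c$ (associativity)

$a \wedge a = a$ (idempotence)    $a \wedge 0 = 0$ (absorption)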

□ 

Each element in the set $\mathcal{L}$ refers to a set of “places”. In particular, 0 means
“nowhere”. The operation $\wedge$ gives the “common places” between the operands.
The axioms make sense under this new nomenclature. Commutativity says that
the common places of a and b are the same as the common places of b and a.
Associativity says that the common places of a, b, and c are always the same
regardless of whether we consider first the common places of a and b, or the
common places of b and c. According to idempotence, the common places of a and
itself are again the places of a. Finally, absorption says that any element of
$\mathcal{L}$ has no common place with nowhere.
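
The reading of the axioms in terms of places can be checked mechanically on a small instance. The sketch below assumes a hypothetical universe of three places and takes sets of places under intersection, with the empty set as “nowhere”; the helper names are illustrative only.

```python
from itertools import combinations, product

def powerset(base):
    """All subsets of base, as frozensets."""
    items = sorted(base)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_semilattice(carrier, meet, zero):
    """Brute-force check of the four axioms of Definition 3."""
    comm = all(meet(a, b) == meet(b, a)
               for a, b in product(carrier, repeat=2))
    assoc = all(meet(a, meet(b, c)) == meet(meet(a, b), c)
                for a, b, c in product(carrier, repeat=3))
    idem = all(meet(a, a) == a for a in carrier)
    absorb = all(meet(a, zero) == zero for a in carrier)
    return comm and assoc and idem and absorb

PLACES = powerset({"p1", "p2", "p3"})

# Intersection with the empty set as "nowhere" satisfies all axioms.
print(is_semilattice(PLACES, frozenset.intersection, frozenset()))
# Union with the same constant violates absorption.
print(is_semilattice(PLACES, frozenset.union, frozenset()))
```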

Now we introduce the concept of adjacent events. Two events are adjacent
if they label two consecutive transitions, or two outgoing transitions from the 
same state. 



Definition 4. Let $T = (S, E, I, \to, \ell)$ be an ATS. Two events $a, b \in E$ are
adjacent in T, notation $adj(a, b)$, if and only if there exist $s, s', s'' \in S$ such that

$s \xrightarrow{a} s' \xrightarrow{b} s''$  or  $s \xrightarrow{b} s' \xrightarrow{a} s''$  or  ($s \xrightarrow{a} s'$ and $s \xrightarrow{b} s''$)



□ 

We are interested in the independence relation between adjacent events. When
two events are not adjacent an observer cannot differentiate whether they are 
independent. For instance, in the ATS of Figure 1 it is not relevant whether a 
and b are independent since that does not affect the overall behavior. 

The carrier set of the algebra of localities associated to an ATS includes an 
appropriate interpretation of its events. Such an interpretation refers to “the 
places where an event happens” . 

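Definition 5. Let $T = (S, E, I, \to, \ell)$ be an ATS. The algebra of localities
associated to T is a structure $\mathcal{A} = (\mathcal{L}, E, \wedge, 0)$ satisfying:

1. $E \subseteq \mathcal{L}$, and $(\mathcal{L}, \wedge, 0)$ is a semilattice

2. $a\,I\,b$ and $adj(a, b) \implies a \wedge b = 0$

3. $\neg(a\,I\,b)$ and $adj(a, b) \implies a \wedge b \neq 0$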

□ 

Example 2. For the ATS of Figure 1 we obtain the following axioms:

$a \wedge c \neq 0$    $c \wedge b \neq 0$

Notice that the axiom system does not say whether $a \wedge b \neq 0$ or $a \wedge b = 0$.
Thus, the algebra does not contradict the decision of implementing the ATS
either with process a.c.b.stop, in which a and b occur in the same place, or with
a.c.stop ||_c c.b.stop, in which a and b occur in different places.

For the ATS of Figure 2 we obtain the following axioms:

$a \wedge b = 0$    $a \wedge c \neq 0$    $b \wedge c \neq 0$

$c \wedge d = 0$    $a \wedge d \neq 0$    $b \wedge d \neq 0$

Notice that the axioms state that a and d must share some places. On the other
hand, as we already said, other approaches to localities cannot identify such a
conflict. □
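
The axioms of Example 2 can be derived mechanically from Definitions 4 and 5. The following sketch does so for hypothetical encodings of the ATSs of Figures 1 and 2 (the state numbers and function names are assumptions of this example). Pairs that are not adjacent receive no axiom, which is why nothing is said about a and b in the first system.

```python
from itertools import combinations

def adjacent(trans, a, b):
    """Definition 4: consecutive transitions, or two transitions
    leaving the same state."""
    out = {}
    for s, e, t in trans:
        out.setdefault(s, []).append((e, t))
    for s, succ in out.items():
        evs = {e for e, _ in succ}
        if a in evs and b in evs:
            return True
        for e, t in succ:
            nxt = {e2 for e2, _ in out.get(t, [])}
            if (e == a and b in nxt) or (e == b and a in nxt):
                return True
    return False

def locality_axioms(events, indep, trans):
    """Definition 5, items 2 and 3, for every pair of distinct events."""
    axioms = {}
    for a, b in combinations(sorted(events), 2):
        if adjacent(trans, a, b):
            is_indep = (a, b) in indep or (b, a) in indep
            axioms[(a, b)] = "= 0" if is_indep else "!= 0"
    return axioms

# Hypothetical encoding of the ATS of Figure 1.
FIG1 = {(0, "a", 1), (1, "c", 2), (2, "b", 3)}
# Hypothetical encoding of the ATS of Figure 2.
FIG2 = {(0, "a", 1), (0, "b", 2), (1, "b", 3), (2, "a", 3),
        (0, "c", 4), (0, "d", 5), (4, "d", 6), (5, "c", 6)}

print(locality_axioms({"a", "b", "c"}, set(), FIG1))
print(locality_axioms({"a", "b", "c", "d"},
                      {("a", "b"), ("c", "d")}, FIG2))
```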

4 Models for the Algebra of Localities 

In this section we introduce several models for the algebra of localities associated 
to a given ATS, thus proving its soundness. Each of our models may be an 
implementation. The interpretation for the events will be based on the relations 
of independence and adjacency. The names of the models are taken from these 
basic relations. 

Fig. 3. I models for a.c.b.stop and a.c.stop ||_c c.b.stop

Fig. 4. I model for (a.stop || b.stop) + (c.stop || d.stop)



The I Models The non-independence models (I models for short) for the
algebra of localities associated to a given ATS assign common places to
non-independent events. We define the non-independence models I and I2, based on
cliques and edges respectively.

Let $T = (S, E, I, \to, \ell)$ be an ATS. We define the graph $G^{I} = (E, \{\{a, b\} \subseteq
E \mid \neg(a\,I\,b)\})$. We define the interpretation of an event a in the model I (I2) to
be the set of cliques (edges) in $G^{I}$ in which a appears.

$[\![a]\!]^{I} \stackrel{def}{=} \{A \in K(G^{I}) \mid a \in A\}$    $[\![a]\!]^{I2} \stackrel{def}{=} \{A \in X(G^{I}) \mid a \in A\}$

Each set $A \in [\![a]\!]^{I(2)}$ is a different place where a may happen: each place is
identified with the set of all events that can happen there. Moreover, an event
can happen in several places simultaneously. The operation $\wedge$ of the algebra of
localities is interpreted as the intersection $\cap$ between sets, and the constant 0 is
interpreted as the empty set $\emptyset$.

Example 3. For the ATS in Figure 1 with $a\,I\,b$, we obtain the graph $G^{I}$ on the
left of Figure 3. This implementation uses two places or localities. One of them is
shared by a and c, and the other by b and c. So, this model is well suited for the
implementation a.c.stop ||_c c.b.stop. In this case, both I and I2 interpretations
coincide. These could be written down as

$[\![a]\!]^{I} = \{\{a, c\}\}$    $[\![b]\!]^{I} = \{\{b, c\}\}$    $[\![c]\!]^{I} = \{\{a, c\}, \{b, c\}\}$

We have a new interpretation in case a and b are not independent. We can
see it on the right of Figure 3. Now, every event occurs in the same place.
In other words, if $\neg(a\,I\,b)$, the I model implements a.c.b.stop.

$[\![a]\!]^{I} = [\![b]\!]^{I} = [\![c]\!]^{I} = \{\{a, b, c\}\}$

A different interpretation is established for model I2. In this case, we have

$[\![a]\!]^{I2} = \{\{a, c\}, \{a, b\}\}$    $[\![b]\!]^{I2} = \{\{b, c\}, \{a, b\}\}$    $[\![c]\!]^{I2} = \{\{a, c\}, \{b, c\}\}$

This model implements the program a.b.stop || a.c.stop || c.b.stop that uses
three localities.

For the ATS of Figure 2 we have $G^{I}$ depicted in Figure 4. The execution of
a requires two places, one shared with c and the other with d. Thus, the event
a prevents the execution of d by occupying a place required by this event. This
reflects the fact that selection between non-independent events actually occurs
in a place. For this implementation, we have

$[\![a]\!]^{I} = \{\{a, c\}, \{a, d\}\}$    $[\![b]\!]^{I} = \{\{b, c\}, \{b, d\}\}$

$[\![c]\!]^{I} = \{\{a, c\}, \{b, c\}\}$    $[\![d]\!]^{I} = \{\{a, d\}, \{b, d\}\}$

□ 
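
The interpretations of Example 3 can be recomputed with a few lines of code. The sketch below builds the graph $G^{I}$, finds its cliques by brute force (adequate only for graphs this small), and interprets $\wedge$ as set intersection. Function and variable names are assumptions of this example.

```python
from itertools import combinations

def graph_I(events, indep):
    """Edges of G^I: unordered pairs of distinct non-independent events."""
    return {frozenset((a, b)) for a, b in combinations(sorted(events), 2)
            if (a, b) not in indep and (b, a) not in indep}

def cliques(vertices, edges):
    """Maximal complete vertex sets, by brute-force enumeration."""
    complete = [frozenset(c) for r in range(1, len(vertices) + 1)
                for c in combinations(sorted(vertices), r)
                if all(frozenset(p) in edges for p in combinations(c, 2))]
    return {c for c in complete if not any(c < d for d in complete)}

def interpret(event, places):
    """Places (cliques or edges) in which the event appears."""
    return {p for p in places if event in p}

# ATS of Figure 2: a I b and c I d.
EVENTS = {"a", "b", "c", "d"}
G = graph_I(EVENTS, {("a", "b"), ("c", "d")})
K = cliques(EVENTS, G)
den = {e: interpret(e, K) for e in EVENTS}

print(den["a"] & den["d"])   # common places of a and d
print(den["a"] & den["b"])   # common places of a and b

# ATS of Figure 1 when a and b are not independent.
G1 = graph_I({"a", "b", "c"}, set())
print(cliques({"a", "b", "c"}, G1))       # places of model I
print(interpret("a", G1))                 # places of a in model I2
```

For Figure 2 the cliques and the edges of $G^{I}$ coincide, so the I and I2 interpretations agree there.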

Now we prove that non-independence models are indeed models for the al- 
gebra of localities. 

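Theorem 1 (Soundness). Let $T = (S, E, I, \to, \ell)$ be an ATS, $\mathcal{A}$ its algebra of
localities, and $[\![E]\!]^{I(2)} \stackrel{def}{=} \{[\![a]\!]^{I(2)} \mid a \in E\}$. Then,

$\mathcal{M}^{I} \stackrel{def}{=} (\mathcal{P}\mathcal{P}(G^{I}), [\![E]\!]^{I}, \cap, \emptyset)$  and  $\mathcal{M}^{I2} \stackrel{def}{=} (\mathcal{P}\mathcal{P}(G^{I}), [\![E]\!]^{I2}, \cap, \emptyset)$

are models for $\mathcal{A}$.

Proof. By definition, $[\![E]\!]^{I2} \subseteq \mathcal{P}\mathcal{P}(G^{I})$. Moreover, $(\mathcal{P}\mathcal{P}(G^{I}), \cap, \emptyset)$ is a well
known semilattice.

Suppose that $a\,I\,b$ and $adj(a, b)$. They are not adjacent in $G^{I}$, and so there is
no edge between a and b in $G^{I}$. Thus, $[\![a]\!]^{I2} \cap [\![b]\!]^{I2} = \emptyset$.

Finally, suppose that $\neg(a\,I\,b)$ and $adj(a, b)$. Then, $ab \in X(G^{I})$, and hence
$[\![a]\!]^{I2} \cap [\![b]\!]^{I2} \neq \emptyset$.

The proof for model I is similar, taking into account Lemma 1. □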

We can see that, although localities may change, the relation between these
two models remains substantially unchanged. More explicitly, two events sharing
resources in any of these models will share resources in the other.

Theorem 2. $\mathcal{M}^{I} \models a \wedge b \neq 0$ if and only if $\mathcal{M}^{I2} \models a \wedge b \neq 0$

Minimal Sharing Models: IJ and IJ2 In the models IJ and IJ2 we assign
common places to events that are both adjacent and non-independent. We will
show they are minimal sharing in the following sense: whenever two events share
a place in these models, they will share a place in any other model.

Let $T = (S, E, I, \to, \ell)$ be an ATS. Taking adjacent events into account we
define the graph $G^{IJ} = (E, \{\{a, b\} \subseteq E \mid \neg(a\,I\,b)$ and $adj(a, b)\})$. As before, we
define the interpretation of an event a to be the set of cliques or edges in $G^{IJ}$
where a appears.

$[\![a]\!]^{IJ} \stackrel{def}{=} \{A \in K(G^{IJ}) \mid a \in A\}$    $[\![a]\!]^{IJ2} \stackrel{def}{=} \{A \in X(G^{IJ}) \mid a \in A\}$

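Theorem 3 (Soundness). Let $T = (S, E, I, \to, \ell)$ be an ATS and let $\mathcal{A}$ be its
algebra of localities. Then,

$\mathcal{M}^{IJ} \stackrel{def}{=} (\mathcal{P}\mathcal{P}(G^{IJ}), [\![E]\!]^{IJ}, \cap, \emptyset)$  and  $\mathcal{M}^{IJ2} \stackrel{def}{=} (\mathcal{P}\mathcal{P}(G^{IJ}), [\![E]\!]^{IJ2}, \cap, \emptyset)$

are models for $\mathcal{A}$.

Theorem 4. $\mathcal{M}^{IJ} \models a \wedge b \neq 0$ if and only if $\mathcal{M}^{IJ2} \models a \wedge b \neq 0$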

These models enjoy the following property: if two events are distributed (i.e.,
do not share a place) in some model for the algebra of localities of a given ATS,
they are also distributed in these models. This justifies calling them minimal
sharing models. The following theorem states the contrapositive of that
property.

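Theorem 5. Let $T = (S, E, I, \to, \ell)$ be an ATS and let $\mathcal{A}$ be its algebra of
localities. Let $\mathcal{M}$ be any model for $\mathcal{A}$. Then, for all events $a, b \in E$,

$\mathcal{M}^{IJ2} \models a \wedge b \neq 0 \implies \mathcal{M} \models a \wedge b \neq 0$

Proof. Suppose $\mathcal{M}^{IJ2} \models a \wedge b \neq 0$, that is $[\![a]\!]^{IJ2} \cap [\![b]\!]^{IJ2} \neq \emptyset$. Thus $ab \in X(G^{IJ})$,
which implies $\neg(a\,I\,b)$ and $adj(a, b)$. So, by Definition 5, $\mathcal{A} \vdash a \wedge b \neq 0$. Hence,
for any model $\mathcal{M}$ of $\mathcal{A}$, $\mathcal{M} \models a \wedge b \neq 0$. □

An easy application of Theorem 4 gives us this corollary:

Corollary 1. $\mathcal{M}^{IJ}$ is a minimal sharing model. □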

Maximal Sharing Models: InJ and InJ2 In a similar way we construct a
model of maximal sharing. In this case, two events share places unless they must
execute independently. We call them InJ models because they may require
non-adjacency.

Let $T = (S, E, I, \to, \ell)$ be an ATS. We define the graph $G^{InJ} = (E, \{\{a, b\} \subseteq
E \mid \neg(a\,I\,b$ and $adj(a, b))\})$. We define the interpretation of an event a to be
the set of cliques or edges in $G^{InJ}$ where a appears.

$[\![a]\!]^{InJ} \stackrel{def}{=} \{A \in K(G^{InJ}) \mid a \in A\}$    $[\![a]\!]^{InJ2} \stackrel{def}{=} \{A \in X(G^{InJ}) \mid a \in A\}$

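Theorem 6 (Soundness). Let $T = (S, E, I, \to, \ell)$ be an ATS and let $\mathcal{A}$ be its
algebra of localities. Then,

$\mathcal{M}^{InJ} \stackrel{def}{=} (\mathcal{P}\mathcal{P}(G^{InJ}), [\![E]\!]^{InJ}, \cap, \emptyset)$  and

$\mathcal{M}^{InJ2} \stackrel{def}{=} (\mathcal{P}\mathcal{P}(G^{InJ}), [\![E]\!]^{InJ2}, \cap, \emptyset)$

are models for $\mathcal{A}$.

Theorem 7. $\mathcal{M}^{InJ} \models a \wedge b \neq 0$ if and only if $\mathcal{M}^{InJ2} \models a \wedge b \neq 0$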

This model describes maximal sharing in the sense that if two events are
distributed in it, they are distributed in any other model. The following theorems
state this property for the InJ models.

Theorem 8. Let $T = (S, E, I, \to, \ell)$ be an ATS and let $\mathcal{A}$ be its algebra of
localities. Let $\mathcal{M}$ be any model for $\mathcal{A}$. Then, for all events $a, b \in E$,

$\mathcal{M}^{InJ2} \models a \wedge b = 0 \implies \mathcal{M} \models a \wedge b = 0$

Corollary 2. $\mathcal{M}^{InJ}$ is a maximal sharing model. □

Example 4. We can see on the right of Figure 3 the graph $G^{InJ}$ for the ATS
of Figure 1, no matter whether a and b are independent. Thus, we obtain the
following interpretation in the maximal sharing model.

$[\![a]\!]^{InJ} = [\![b]\!]^{InJ} = [\![c]\!]^{InJ} = \{\{a, b, c\}\}$

Thus, $\mathcal{M}^{InJ} \models a \wedge b \neq 0$. However, from Example 3 we know that when $a\,I\,b$,
$\mathcal{M}^{I} \models a \wedge b = 0$. So, we have that I models are not maximal sharing models.

For the same ATS, when a and b are not independent, we have $\mathcal{M}^{I} \models
a \wedge b \neq 0$ and $\mathcal{M}^{IJ} \models a \wedge b = 0$. Thus, I models are not minimal sharing models
either. □
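
The comparison made in Example 4 can be replayed on the edge-based models, where two distinct events share a place exactly when they are joined by an edge of the underlying graph. The sketch below computes the edge sets of $G^{IJ}$, $G^{I}$ and $G^{InJ}$ from given independence and adjacency relations; representing both relations as sets of unordered pairs, and all names used, are assumptions of this example.

```python
from itertools import combinations

def sharing_graphs(events, indep, adj):
    """Edge sets of G^IJ, G^I and G^InJ.

    indep and adj are sets of frozenset pairs of events.
    """
    pairs = [frozenset(p) for p in combinations(sorted(events), 2)]
    g_ij = {p for p in pairs if p not in indep and p in adj}
    g_i = {p for p in pairs if p not in indep}
    g_inj = {p for p in pairs if not (p in indep and p in adj)}
    return g_ij, g_i, g_inj

# ATS of Figure 1: only (a, c) and (b, c) are adjacent.
EVENTS = {"a", "b", "c"}
ADJ = {frozenset("ac"), frozenset("bc")}
AB = frozenset("ab")

for indep in (set(), {AB}):
    g_ij, g_i, g_inj = sharing_graphs(EVENTS, indep, ADJ)
    # Do a and b share a place in IJ2, I2 and InJ2 respectively?
    print(AB in indep, AB in g_ij, AB in g_i, AB in g_inj)
    # Sharing in a lower model implies sharing in the upper ones.
    print(g_ij <= g_i <= g_inj)
```

When a and b are independent, only the InJ graph joins them; when they are not, the I graph joins them as well while the IJ graph still does not, which is the situation described in the example.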

5 Conclusions 

In this work we have exploited the information about localities hidden in the
ATS definition. Such information helps us to find implementations of systems
with certain properties, like maximal or minimal sharing of localities.

The way to state how the localities of events are related is by means of the
algebra of localities. We have introduced several models for this algebra and
showed that this is not a trivial set of models. Figure 5 summarizes our results
in Section 4. The up-going arrows in the picture mean that sharing on the lower
models implies sharing on the upper models.

Fig. 5. Models of localities

We have also shown that our semantic approach clearly exposes the difficulties
that arise in syntactic, language-oriented approaches when dealing with
non-deterministic choices.

As a consequence of this work we can extract locality information from a
specification written in terms of ATS. So, the ATS formalism appears as a good
candidate to become a theoretical assembler for distributed programming. There
are at least three interesting directions to continue this work. One of them is
to reach a deeper comprehension of locality models. The nature of the hierarchy
of models seems far from trivial, requiring more detailed studies of
its structure. We believe that research in this direction will allow us to detect
not only minimal sharing models, but also models with some constraints which
require fewer localities to work.

We may develop the same strategy for other semantic formalisms, that is, 
to associate an algebra of localities and to obtain a model as before. Event 
structures [12], from where the notion of independence can be easily derived, 
would be a good candidate to study. 

Another direction for future work would be to extend ATS with new
characteristics. Time is a natural factor to consider in these extensions, since
resources are used by events for a certain time. A relation between a not yet
defined timed ATS and timed graphs [2] would enable us to move into timed
systems, where tools and methods for automatic verification have been developed.

Another way of continuing our work is the development of a toolkit for the
description of systems based on ATS. We believe that semantic studies in
programming must come together with software development, and so the
implementation of good toolkits for both theoretical and practical developments
will become more important in the future.

Abstract. We investigate extensions of CTL that allow expressing quantitative
requirements about an abstract notion of time in a simple discrete-time framework,
and study the expressive power of several relevant logics.

When only subscripted modalities are used, polynomial-time model checking is 
possible even for the largest logic we consider, while introducing freeze quantifiers 
leads to a complexity blow-up. 



1 Introduction 



Temporal logic is widely used as a formal language for specifying the behaviour of 
reactive systems (see [7]). This approach allows model checking, i.e. the automatic 
verification that a finite state system satisfies its expected behavioural specifications.
The main limitation to model checking is the state-explosion problem but, in practice, 
symbolic model checking techniques [5] have been impressively successful, and model 
checking is now commonly used in the design of critical reactive systems. 

Real-time. While temporal logics only deal with “before and after” properties, real- 
time temporal logics and more generally quantitative temporal logics aim at expressing 
quantitative properties of the time elapsed during computations. Popular real-time logics 
are based on timed transition systems and appear in several tools (e.g., HyTech, Uppaal, 
Kronos). The main drawback is that model checking is expensive [2,4]. 

Efficient model checking. By contrast, some real-time temporal logics retain usual
discrete Kripke structures as models and allow referring to quantitative information with
“bounded” modalities such as “$AF_{\leq 10}\, A$”, meaning that A will inevitably occur in at
most 10 steps. A specific aspect of this framework is that the underlying Kripke structures
have no inherent concept of time. It is the designer of the Kripke structure who decides
to encode the flow of elapsing time by this or that event, so that the temporal logics in use
are more properly called quantitative temporal logics than real-time logics. [8] showed
that RTCTL (i.e. CTL plus bounded modalities “$A\,\_\,U_{\leq k}\,\_$” and “$E\,\_\,U_{\leq k}\,\_$” in the Kripke
structure framework) still enjoys the bilinear model checking time complexity of CTL.

Our contribution. One important question is how far one can go along the lines of
RTCTL-like logics while still allowing efficient model checking. Here we study two
quantitative extensions of CTL, investigate their expressive power and evaluate the 
complexity of model checking. 

The first extension, called TCTLs, s for “subscripts”, is basically the most general
logic along the lines of the RTCTL proposal: it allows combining “< k”, “> k” and
“= k” (so that modalities counting w.r.t. intervals are possible). We show this brings real
improvements in expressive power, and model checking is still in polynomial time. This
extends results for RTCTL beyond the increased expressivity: we use a finer measure
for size of formula ($EF_{=k}$ has size in $O(\log k)$ and not k) and do not require that one
step uses one unit of time.

The second extension, called TCTLc, c for “clocks”, uses formula clocks, a.k.a.
freeze quantifiers [3], and is a more general way of counting events. TCTLc can still be
translated directly into CTL but model checking is expensive.

The results on expressive power formalize natural intuitions which (as far as we
know) have never been proven formally, even in the dense time framework¹.
Furthermore, in our discrete time framework our results on expressive power must be
stated in terms of how succinctly one logic can express this or that property. Such
proofs are scarce in the literature (one example is [13]).

Related work. TCTLs and TCTLc are similar to (and inspired by) logics used in
dense real-time frameworks (though, in the discrete framework we use here, their be- 
haviour is quite different). Our results on complexity of model checking build on ideas 
from [6,11,4,10]. 

Other branching-time extensions of RTCTL have been considered. Counting with 
regular patterns makes model checking intractable [9]. Merging different time scales 
makes model checking NP-complete [10]. Allowing parameters makes model checking 
exponential in the number of parameters [10]. 

Another extension with freeze variables can be found in [14] where richer constraints
on number of occurrences of events can be stated (rendering satisfiability undecidable).
On the other hand, the “until” modality is not included and the expressive power of 
different kinds of constraints is not investigated. 

Plan of the paper. We introduce the basic notions and definitions in § 2. We discuss 
expressive power in § 3 and model checking in § 4. We assume the reader is familiar with 
standard notions of branching-time temporal logic (see [7]) and structural complexity 
(see [12]). Complete proofs appear in a full version of the paper, available from the 
authors. 

2 CTL + Discrete Time 

We write N for the set of natural numbers, and AP = {A, B, . . .} for a finite set of 
atomic propositions. Temporal formulae are interpreted over states in Kripke structures. 
Formally, 

¹ See e.g. the conjecture at the end of [1] which becomes an unproved statement in [2].

Definition 2.1. A Kripke structure (a “KS”) is a tuple $S = \langle Q_S, R_S, l_S \rangle$ where $Q_S =
\{q_1, \ldots\}$ is a non-empty set of states, $R_S \subseteq Q_S \times Q_S$ is a total transition relation, and
$l_S : Q_S \to 2^{AP}$ labels every state with the propositions it satisfies.

Below, we drop the “S” subscript in our notations whenever no ambiguity will arise. A
computation in a KS is an infinite sequence $\pi$ of the form $q_0 q_1 \ldots$ s.t. $(q_i, q_{i+1}) \in R$
for all $i \in N$. For $i \in N$, $\pi(i)$ (resp. $\pi|_i$) denotes the i-th state, $q_i$ (resp. i-th prefix:
$q_0, q_1, \ldots, q_i$). We write $\Pi(q)$ for the set of all computations starting from q. Since R is
total, $\Pi(q)$ is never empty.

The flow of time. We assume a special atomic proposition $tick \in AP$ that describes the
elapsing of time in the model. The intuition is that states labeled by tick are states where
we observe that time has just elapsed, that the clock just ticked. Equivalently, we can
see all transitions as taking 1 time unit if they reach a state labeled by tick, and as being
instantaneous otherwise². In pictures, we use different grey levels to distinguish tick
states from non-tick ones.

Given a computation $\pi = q_0 q_1 \cdots$ and $i \geq 0$, $Time(\pi|_i)$ denotes $|\{ j \mid 0 < j \leq
i \wedge tick \in l(q_j)\}|$, the time it took to reach $q_i$ from $q_0$ along $\pi$.
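
As a small illustration of this definition, the sketch below computes $Time(\pi|_i)$ on a finite prefix given as a list of label sets. It assumes the reading in which the ticks of $q_1, \ldots, q_i$ are counted and the label of $q_0$ is ignored; the function name and the sample prefix are hypothetical.

```python
def time_of_prefix(labels, i):
    """Time(pi|_i): number of tick-labelled states among q_1 .. q_i.

    labels[j] is the set of atomic propositions holding in q_j.
    """
    return sum(1 for j in range(1, i + 1) if "tick" in labels[j])

# Hypothetical prefix q_0 .. q_4 of some computation.
LABELS = [{"tick"}, set(), {"tick", "A"}, {"tick"}, {"A"}]

print([time_of_prefix(LABELS, i) for i in range(len(LABELS))])
```

Note that the tick on $q_0$ does not contribute: no transition has been taken yet when the computation starts.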

2.1 TCTLs

Syntax. TCTLs formulae are given by the following grammar:

$\varphi, \psi ::= \neg\varphi \mid (\varphi \wedge \psi) \mid EX\varphi \mid E\varphi U_I \psi \mid A\varphi U_I \psi \mid A \mid B \mid \ldots$

where I can be any finite union $[a_1, b_1[ \cup \cdots \cup [a_n, b_n[$ of disjoint integer intervals with
$0 \leq a_1 < b_1 < a_2 < b_2 < \cdots a_n < b_n \leq \omega$.

Standard abbreviations include $\top$, $\bot$, $\varphi \vee \psi$, $\varphi \Rightarrow \psi$, ... as well as $EF_I\, \varphi$ (for
$E\top U_I\, \varphi$), $AF_I\, \varphi$ (for $A\top U_I\, \varphi$), $EG_I\, \varphi$ (for $\neg AF_I\, \neg\varphi$), and $AG_I\, \varphi$ (for $\neg EF_I\, \neg\varphi$).

Moreover we let $U_{<k}$ stand for $U_{[0,k[}$, $U_{>k}$ for $U_{[k+1,\omega[}$, and $U_{=k}$ for $U_{[k,k+1[}$.
The usual CTL operators are included since the usual U corresponds to $U_{<\omega}$.

Semantics. Figure 1 defines when a state q in some KS S satisfies a TCTLs formula
$\varphi$, written $q \models \varphi$, by induction over the structure of $\varphi$.

We let TCTLs[<], TCTLs[<, =], etc. denote the fragments of TCTLs where only
simple constraints using only < (resp. < or =, etc.) are allowed. E.g., RTCTL is
TCTLs[<] (with the proviso that our KS’s have tick’s).

2.2 TCTLc

TCTLc uses freeze quantifiers [3]. Here “clocks” are introduced in the formula, set to 
zero when they are bound, and can be referenced “later” in arbitrary ways. This standard 
construct gives more flexibility than subscripts. 

² Thus KS’s with tick’s can be seen as discrete timed structures, i.e. KS’s where edges $(q, q') \in R$
are labeled by a natural number: the time it takes to follow the edge. While discrete timed
structures are more natural, KS’s with tick are an essentially equivalent framework where
technicalities are simpler since they do not need labels on the edges.

$q \models A$ iff $A \in l(q)$,

$q \models \neg\varphi$ iff $q \not\models \varphi$,

$q \models \varphi \wedge \psi$ iff $q \models \varphi$ and $q \models \psi$,

$q \models EX\varphi$ iff there exists $\pi \in \Pi(q)$ s.t. $\pi \models X\varphi$,

$q \models E\varphi U_I \psi$ iff there exists $\pi \in \Pi(q)$ s.t. $\pi \models \varphi U_I \psi$,

$q \models A\varphi U_I \psi$ iff for all $\pi \in \Pi(q)$, we have $\pi \models \varphi U_I \psi$,

$\pi \models X\varphi$ iff $\pi(1) \models \varphi$,

$\pi \models \varphi U_I \psi$ iff there exists $i \geq 0$ s.t. $Time(\pi|_i) \in I$
and $\pi(i) \models \psi$ and $\pi(j) \models \varphi$ for all $0 \leq j < i$.

Fig. 1. Semantics of TCTLs
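
To see the semantics of a subscripted modality at work, the following sketch decides $q_0 \models E\varphi U_{<k} \psi$ on a finite KS by a breadth-first search over pairs (state, elapsed time). It is only a direct reading of Figure 1 for this one modality, not the model checking algorithm of Section 4; the sets of states satisfying $\varphi$ and $\psi$ are assumed to be given, and the KS, the names and the encoding are hypothetical. Because R is total, every finite path found by the search extends to a computation.

```python
from collections import deque

def eu_less_than(succ, labels, q0, sat_phi, sat_psi, k):
    """Decide q0 |= E phi U_{<k} psi on a finite Kripke structure.

    succ maps a state to its successors, labels maps a state to its
    set of propositions, sat_phi and sat_psi are sets of states.
    """
    if k <= 0:
        return False                     # the interval [0, k[ is empty
    seen = {(q0, 0)}
    queue = deque(seen)
    while queue:
        q, t = queue.popleft()
        if q in sat_psi:
            return True                  # psi reached with Time < k
        if q not in sat_phi:
            continue                     # phi must hold strictly before
        for nxt in succ[q]:
            t2 = t + (1 if "tick" in labels[nxt] else 0)
            if t2 < k and (nxt, t2) not in seen:
                seen.add((nxt, t2))
                queue.append((nxt, t2))
    return False

# Hypothetical chain a2 -> a1 -> a0 -> g, with a self-loop on g.
SUCC = {"a2": ["a1"], "a1": ["a0"], "a0": ["g"], "g": ["g"]}
LABELS = {"a2": {"tick"}, "a1": {"tick"}, "a0": {"tick"}, "g": {"A"}}
ALL = set(SUCC)

# EF_{<k} A is E true U_{<k} A.
print(eu_less_than(SUCC, LABELS, "a2", ALL, {"g"}, 3))
print(eu_less_than(SUCC, LABELS, "a2", ALL, {"g"}, 2))
```

Elapsed time is capped at k, so the search visits at most |Q| · k pairs; with k written in binary this is exponential in the size of the subscript, which is why a smarter procedure is needed for the complexity bounds discussed later.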

Syntax. For a set $Cl = \{x, y, \ldots\}$ of clocks, TCTLc formulae are given by the following
grammar:

$\varphi, \psi ::= \neg\varphi \mid \varphi \wedge \psi \mid EX\varphi \mid E\varphi U\psi \mid A\varphi U\psi \mid x\ \mathrm{in}\ \varphi \mid x \sim k \mid A \mid B \mid \ldots$

where $\sim\ \in \{=, <, \leq, >, \geq\}$ and $k \in N$. Constraints referring to clocks are restricted to
the simple form $x \sim k$, in the spirit of TCTLs.

An occurrence of a formula clock x in some $x \sim k$ is bound if it is in the scope
of a “$x\ \mathrm{in}$” freeze quantifier, otherwise it is free. A formula is closed if it has no free
variables. Only closed formulae express properties of states in KS’s.



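Semantics. TCTLc formulae are interpreted over a state of a KS S together with a
valuation $v : Cl \to N$ of the clocks free in $\varphi$.

$q, v \models A$ iff $A \in l(q)$,

$q, v \models \neg\varphi$ iff $q, v \not\models \varphi$,

$q, v \models \varphi \wedge \psi$ iff $q, v \models \varphi$ and $q, v \models \psi$,

$q, v \models EX\varphi$ iff there exists $\pi \in \Pi(q)$ s.t. $\pi, v \models X\varphi$,

$q, v \models E\varphi U\psi$ iff there exists $\pi \in \Pi(q)$ s.t. $\pi, v \models \varphi U\psi$,

$q, v \models A\varphi U\psi$ iff for all $\pi \in \Pi(q)$ we have $\pi, v \models \varphi U\psi$,

$q, v \models x\ \mathrm{in}\ \varphi$ iff $q, v[x \leftarrow 0] \models \varphi$,

$q, v \models x \sim k$ iff $v(x) \sim k$,

$\pi, v \models X\varphi$ iff $\pi(1), v + d \models \varphi$ with $d = Time(\pi|_1)$,

$\pi, v \models \varphi U\psi$ iff there exists $i \geq 0$ s.t. $\pi(i), v + d_i \models \psi$ and
$\pi(j), v + d_j \models \varphi$ for all $0 \leq j < i$ (where $d_i \stackrel{def}{=} Time(\pi|_i)$).

Fig. 2. Semantics of TCTLc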



Figure 2 defines when $q, v \models \varphi$ in some KS S by induction over the structure of $\varphi$.
For $m \in N$, $v + m$ denotes the valuation which maps each clock $x \in Cl$ to the value
$v(x) + m$, and $v[x \leftarrow 0]$ is v where now x evaluates to 0.

Clearly the TCTLs operators can be defined with TCTLc operators:

$E\varphi U_I \psi \equiv x\ \mathrm{in}\ \big(E\varphi U(I(x) \wedge \psi)\big)$    $A\varphi U_I \psi \equiv x\ \mathrm{in}\ \big(A\varphi U(I(x) \wedge \psi)\big)$

where, for I of the form $[a_1, b_1[ \cup \cdots \cup [a_n, b_n[$, $I(x)$ denotes the clocks constraint
$\bigvee_{i=1}^{n} \big((a_i \leq x) \wedge (x < b_i)\big)$. Hence TCTLs can be seen as a fragment of TCTLc where
only one formula clock is allowed (and used in restricted ways).



A standard observation for logics such as TCTLc is that the actual values recorded
in v are only relevant up to a certain point depending on the formula at hand. Let $M_\varphi$
denote the largest constant appearing in $\varphi$ (largest k in the “$x \sim k$”’s) and, for $m \in N$,
let $v =_m v'$ when for any $x \in Cl$, either $v(x) = v'(x)$ or $v(x) > m < v'(x)$ (i.e. v and
v' agree, or are beyond m).

Lemma 2.2. If $v =_m v'$ and $m \geq M_\varphi$, then $q, v \models \varphi$ iff $q, v' \models \varphi$.

Proof. Easy induction over the structure of $\varphi$, using the fact that $v =_m v'$ entails
$v + k =_m v' + k$ and $v[x \leftarrow 0] =_m v'[x \leftarrow 0]$. □
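
The operations on valuations used here are simple enough to write down directly. The sketch below represents a valuation as a dictionary from clock names to natural numbers and implements $v + m$, $v[x \leftarrow 0]$ and the relation $=_m$; the function names are hypothetical, and both valuations are assumed to be defined on the same clocks.

```python
def shift(v, m):
    """The valuation v + m: every clock advances by m."""
    return {x: n + m for x, n in v.items()}

def reset(v, x):
    """The valuation v[x <- 0]: clock x is set to zero."""
    return {**v, x: 0}

def equiv_up_to(v, w, m):
    """v =_m w: on every clock the values agree or both exceed m."""
    return all(v[x] == w[x] or (v[x] > m and w[x] > m) for x in v)

v = {"x": 2, "y": 7}
w = {"x": 2, "y": 9}

print(equiv_up_to(v, w, 5))                        # y is beyond 5 in both
print(equiv_up_to(shift(v, 3), shift(w, 3), 5))    # preserved by shifting
print(equiv_up_to(reset(v, "y"), reset(w, "y"), 5))
print(equiv_up_to(v, w, 8))                        # 7 is not beyond 8
```

The second and third lines illustrate, on one instance, the two preservation facts that the proof of Lemma 2.2 relies on.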



Remark 2.3. A related property is used by Emerson et al. in their study of RTCTL:
when checking whether $q \models \varphi$ inside some KS with $|Q| = m$ states, it is possible to
replace by m any constant k larger than m in the subscripts of $\varphi$. We emphasize that this
property does not hold for TCTLs[=] (it does hold for TCTLs[<, >]). □

The size of our formulae is the length of the string³ used to write them down in
a sufficiently succinct way, e.g., $|A\alpha U_I \beta|$ is $1 + |\alpha| + |\beta| + |I|$. For I of
the form $[a_1, b_1[ \cup \cdots \cup [a_n, b_n[$, we have $|I| \stackrel{def}{=} \lceil\log a_1\rceil + \cdots + \lceil\log b_n\rceil$ (assuming
$\log(0) = \log(\omega) = 0$). $ht(\varphi)$ denotes the temporal height of formula $\varphi$. As usual, it is
the maximal number of nested modalities in $\varphi$. Obviously, $ht(\varphi)$ is smaller than the size
of $\varphi$ (even when viewed as a dag).
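
The size measure for subscripts can be computed as follows. This sketch assumes base-2 logarithms (the text does not fix the base), represents $\omega$ by `math.inf`, and uses hypothetical function names. It shows in particular that a subscript such as “= k” contributes a number of symbols logarithmic in k.

```python
import math

def ceil_log2(n):
    """Ceiling of log2(n), with the convention log(0) = log(omega) = 0."""
    if n == 0 or n == math.inf:
        return 0
    return (n - 1).bit_length()

def interval_size(intervals):
    """|I| for I given as a list of pairs (a_i, b_i) standing for [a_i, b_i[."""
    return sum(ceil_log2(a) + ceil_log2(b) for a, b in intervals)

def size_au(size_alpha, size_beta, intervals):
    """|A alpha U_I beta| = 1 + |alpha| + |beta| + |I|."""
    return 1 + size_alpha + size_beta + interval_size(intervals)

print(interval_size([(0, 1024)]))                 # subscript "< 1024"
print(interval_size([(3, 5), (8, math.inf)]))
print(size_au(1, 1, [(1000, 1001)]))              # subscript "= 1000"
```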

3 Expressivity 

Formally, TCTLs or TCTLc do not add expressive power to CTL:

Theorem 3.1. Any closed TCTLc (or TCTLs) formula is equivalent to a CTL formula.

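Proof. With any TCTLc formula $\varphi$, and valuation v, we associate a CTL formula $(\varphi)^v$
s.t. $q, v \models \varphi$ iff $q \models (\varphi)^v$ for any state q of any Kripke structure. Then, if $\varphi$ has no free
clock variables, any $(\varphi)^v$ is a CTL equivalent to $\varphi$. The definition of $(\varphi)^v$ is given by
the following rewrite rules:

$(x \sim k)^v \stackrel{def}{=} \top$ if $v(x) \sim k$, and $\bot$ otherwise

$(x\ \mathrm{in}\ \varphi)^v \stackrel{def}{=} (\varphi)^{v[x \leftarrow 0]}$

$(\neg\varphi)^v \stackrel{def}{=} \neg(\varphi)^v$

$(\varphi \wedge \psi)^v \stackrel{def}{=} \varphi^v \wedge \psi^v$

$(A)^v \stackrel{def}{=} A$

³ We sometimes see a formula as a dag, where identical subformulae are only counted once.
Such cases are stated explicitly.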




F. Laroussinie, Ph. Schnoebelen, and M. Turuani 










if u + 1 =M^ V, 

A(^tick) U ( {-^tick A ip'’) V {tick A (AF^)'""'"^) 



ifu + 1 V, 



otherwise 



(p"A EX[E{p'’A^tick) u ( (y^’A^tick) V {tick A {Epuyy+y 

otherwise 



This gives a well-founded definition for $(\_)^v$ since in the right-hand sides either $(\_)^v$
is recursively applied over subformulae, or $(\_)^{v+1}$ is applied on the same formula (or
both). But moving from $(\_)^v$ to $(\_)^{v+1}$ is only done until $v =_M v + 1$, which is bound
to eventually happen. Then it is a routine matter to check that the correctness invariant
(i.e., “$q, v \models \varphi$ iff $q \models (\varphi)^v$”) is preserved by these rules. □



The translation we just gave is easy to describe but the resulting $(\varphi)^v$ formulae have
enormous size. It turns out that this cannot be avoided. Even more, we can say that
moving from CTL to TCTLs[<] to TCTLs to ... allows writing new formulae that
have no succinct equivalent at the previous level.

Theorem 3.2. 1. TCTLs[<] can be exponentially more succinct than CTL,

2. TCTLs[<, >] can be exponentially more succinct than TCTLs[<].

The proof is given by the following lemmas.

Lemma 3.3. Any CTL formula equivalent to $EF_{<n}\, A$ (a log n-sized formula) has
temporal height at least n.



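Proof. Consider the KS described in Figure 3. One easily shows (by structural induction
over $\varphi$) that for any CTL formula $\varphi$, $ht(\varphi) < i$ implies $\alpha_i \models \varphi$ iff $\alpha_{i+1} \models \varphi$. On the
other hand, $\alpha_j \models EF_{<n}\, A$ iff $j < n$. Thus any CTL equivalent to $EF_{<n}\, A$ must have
temporal height larger than n. □

Fig. 3. $\alpha_n \models EF_{<n+1}\, A$ and $\alpha_{n+1} \not\models EF_{<n+1}\, A$ (the states $\alpha_i$ satisfy $tick \wedge \neg A$;
the remaining state satisfies $\neg tick \wedge A$)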



Lemma 3.4. Any TCTLs[<] formula equivalent to $EF_{>n}\, A$ (a log n-sized formula) has
temporal height at least n.

Proof. Consider the KS described in Figure 4. One easily shows (by structural induction
over $\varphi$) that for any formula $\varphi$ in TCTLs[<], $ht(\varphi) < i$ implies $\alpha_i \models \varphi$ iff $\alpha_{i+1} \models \varphi$
and $\beta_i \models \varphi$ iff $\beta_{i+1} \models \varphi$. On the other hand, $\alpha_j \models EF_{>n}\, A$ iff $j > n$. Thus any
TCTLs[<] equivalent to $EF_{>n}\, A$ must have temporal height larger than n. □

Let us mention two (natural) conjectures that would allow separating further frag- 
ments: 

Fig. 4. $\alpha_n \not\models EF_{>n}\, A$ and $\alpha_{n+1} \models EF_{>n}\, A$ (the states $\alpha_i$ satisfy $tick \wedge \neg A$,
the states $\beta_i$ satisfy $\neg tick \wedge \neg A$, and the state $\gamma$ satisfies $\neg tick \wedge A$)

Conjecture 3.5. 1. TCTLs[<, >, =] can be exponentially more succinct than
TCTLs[<, >],

2. TCTLc can be exponentially more succinct than TCTLs.

We have not yet been able to find the required proofs, which are hard to build. The first 
point is based on the conjecture that any TCTLg[<, >] formula equivalent to Ahas 

temporal height at least k. For the second one, we conjecture that any TCTLs formula 
equivalent to $x\ \mathsf{in}\ \mathsf{EF}(A \land \mathsf{EF}(B \land x = k))$ has size at least k. 

We have explained how TCTLs becomes more and more expressive when we allow 
subscripts with <, then also with >, then also with =. Subscripts of the form “= k” are 
the main difference between RTCTL and our proposal. They enhance expressivity and 
make model checking more complex (see § 4). 

Once we have TCTLs[<, >, =], subscripts with intervals are just a convenient 
shorthand: 




Theorem 3.6. TCTLs is not more succinct than TCTLs[<, >, =]. 

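Proof. For $I$ of the form $\bigcup_{i=1,\dots,n} [a_i, b_i[$, we denote by $I - k$ the set $\bigcup_{i=1,\dots,n} [a_i - k, b_i - k[$ 
(after the obvious normalization if $k > a_i$). 

Let $\varphi$ be a TCTLs formula. We build an equivalent TCTLs[<, >, =] formula with 
the following equivalences: 

$$\mathsf{E}\,\alpha\,\mathsf{U}_I\,\beta \;\equiv\; \bigvee_{i=1,\dots,n} \mathsf{E}\,\alpha\,\mathsf{U}_{=a_i} \bigl(\mathsf{E}\,\alpha\,\mathsf{U}_{<b_i - a_i}\,\beta\bigr)$$

$$\mathsf{A}\,\alpha\,\mathsf{U}_I\,\beta \;\equiv\; \begin{cases} \mathsf{A}\,\alpha\,\mathsf{U}_{=a_1} \bigl(\mathsf{A}\,\alpha\,\mathsf{U}_{I - a_1}\,\beta\bigr) & \text{if } a_1 > 0, \\[4pt] \lnot\mathsf{E}(\lnot\beta)\,\mathsf{U}_{<b_1}\,(\lnot\alpha \land \lnot\beta) \;\land\; \lnot\mathsf{E}(\lnot\beta)\,\mathsf{U}_{=b_1} \bigl(\lnot\mathsf{A}\,\alpha\,\mathsf{U}_{=a_2 - b_1} (\mathsf{A}\,\alpha\,\mathsf{U}_{I - a_2}\,\beta)\bigr) & \text{otherwise.} \end{cases}$$

Correctness is easy to check. The size of the resulting formula, seen as a dag, is linear in the size of $\varphi$ seen 
as a dag *. □ 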
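The interval shift $I - k$ used in the proof is simple to make concrete. The following sketch assumes one particular reading of "the obvious normalization": lower bounds that become negative are clamped to 0 and intervals that become empty are dropped. The representation (pairs `(a, b)` for $[a, b[$, with `None` for an infinite upper bound) is our own choice.

```python
def shift(intervals, k):
    """Return I - k for I given as a list of half-open pairs (a, b) = [a, b[.

    Assumed normalisation: clamp negative lower bounds to 0 and drop
    intervals that become empty. An upper bound of None means infinity.
    """
    shifted = []
    for a, b in intervals:
        lo = max(a - k, 0)
        hi = None if b is None else b - k
        if hi is None or lo < hi:
            shifted.append((lo, hi))
    return shifted
```

For example, `shift([(2, 5), (7, 9)], 3)` yields `[(0, 2), (4, 6)]`: the first interval is clamped at 0 and the second is simply translated.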



4 Model Checking 

For the logics we investigate, the model checking problem is the problem of computing 
whether $q \models \varphi$ for $q$ a state of a KS $S$ and $\varphi$ a temporal formula. In this section we 
analyse the complexity of model checking problems for TCTLs and TCTLc. 

* Viewing formulae as dags is convenient here, and agrees with our later use of Theorem 3.6 when 
we investigate efficient model checking for TCTLs. 
Given a KS $S$ and a formula $\varphi$, the complexity of model checking can be evaluated 
in terms of $|S|$ and $|\varphi|$. But more discriminating information can be obtained by also 
looking at the program complexity of model checking (i.e., the complexity when $\varphi$ is 
fixed and $S, q$ is the only input) and the formula complexity (i.e., when $S, q$ is fixed and 
$\varphi$ is the only input). 

While TCTLs model checking can be done efficiently, this is not true for TCTLc 
(even when considering a fixed KS). 

Theorem 4.1. Let $S = (Q, R, l)$ be a KS and $\varphi$ a TCTLs formula. There exists a model 
checking algorithm running in time $O((|Q|^3 + |R|) \times |\varphi|)$. Moreover if $\varphi$ belongs to 
TCTLs[<, >], the algorithm runs in time $O((|Q| + |R|) \times |\varphi|)$. 

Proof (Idea). The algorithm extends the classical algorithms for CTL and RTCTL 
(see [8]) with procedures dealing with TCTLs[<, =, >] operators (as seen in Theorem 3.6, 
formulae with interval subscripts can be decomposed). The most expensive 
procedure concerns the $\mathsf{EU}_{=}$ case where we compute the transitive closures of relations, 
hence the (quite naive) $O(|Q|^3 + |R|)$. The TCTLs[<, >] fragment uses only 
procedures in $O((|Q| + |R|) \times |\varphi|)$. □ 
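To make the $\mathsf{EU}_{=}$ case more tangible, here is a deliberately naive sketch that computes the states satisfying $\mathsf{E}\,\alpha\,\mathsf{U}_{=k}\,\beta$ layer by layer. It is not the procedure of the proof: it iterates $k$ times and is therefore only pseudo-polynomial when $k$ is written in binary, whereas the proof relies on transitive closures. The tick convention (a step costs one tick iff its target is labelled `tick`) and all names are assumptions of the sketch.

```python
def sat_EU_exact(states, succ, tick, alpha, beta, k):
    """States satisfying E alpha U_{=k} beta, one tick layer at a time.

    Assumptions: a step q -> r costs one tick iff r is in `tick`, and alpha
    must hold on every state strictly before the beta-state.
    """
    def close(seed):
        # Least fixpoint: add alpha-states that have a zero-cost step into the set.
        sat = set(seed)
        changed = True
        while changed:
            changed = False
            for q in states:
                if q in sat or q not in alpha:
                    continue
                if any(r in sat and r not in tick for r in succ[q]):
                    sat.add(q)
                    changed = True
        return sat

    layer = close(set(beta))  # witnesses using exactly 0 ticks
    for _ in range(k):
        # alpha-states with a one-tick step into the previous layer
        seed = {q for q in states
                if q in alpha and any(r in layer and r in tick for r in succ[q])}
        layer = close(seed)
    return layer
```

On a chain `a3 -> a2 -> a1 -> a0 -> g` with every `a_i` labelled `tick`, `alpha` true everywhere and `beta = {"g"}`, the layer for `k = 2` is `{"a2"}`.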



Theorem 4.2. The model checking problem for TCTLc is PSPACE-complete. The for- 
mula complexity of TCTLc model checking is PSPACE-complete. 



Proof. To prove this result, it is sufficient to show that TCTLc model checking is in 
PSPACE ^ and that the formula complexity is PSPACE-hard. The proof of this last point 
relies on ideas from [4]: let $P$ be an instance of QBF (Quantified Boolean Formula, a 
PSPACE-complete problem). W.l.o.g. $P$ is some $Q_1 p_1 \dots Q_n p_n . \varphi$ (with $Q_i \in \{\exists, \forall\}$ 
and $\varphi$ a propositional formula over $p_1, \dots, p_n$). We reduce $P$ to a model checking 
problem $S, q \models \Phi$ where $S$ is the simple KS $(\{q\}, \{q \to q\}, \{l(q) = tick\})$ and $\Phi$ is the 
following TCTLc formula: 

$$t\ \mathsf{in}\ \mathsf{EF}\Bigl(t = 1 \land O_1\bigl(x_1\ \mathsf{in}\ \mathsf{EF}\bigl(t = 2 \land \dots \bigl(t = i \land O_i\bigl(x_i\ \mathsf{in}\ \mathsf{EF}(t = i+1 \land \dots\ \mathsf{EF}(t = n+1 \land \tilde{\varphi}) \dots)\bigr)\bigr) \dots \bigr)\bigr)\Bigr)$$

where $O_i$ is $\mathsf{EF}_{\leq 1}$ (resp. $\mathsf{EG}_{\leq 1}$) if $Q_i$ is $\exists$ (resp. $\forall$) and $\tilde{\varphi}$ is $\varphi$ where occurrences of 
$p_i$ have been replaced by $x_i = n + 1 - i$. Observe that any clock $x_i$ is reset at time $i$ 
or $i + 1$, and depending on this reset time the atomic proposition $p_i$ will be interpreted 
as true or false after the $(n+1)$-th transition. The operator $\mathsf{EF}_{\leq 1}$ (resp. $\mathsf{EG}_{\leq 1}$) allows us to 
quantify existentially (resp. universally) over these two reset times. Clearly $P$ is valid iff 
$S, q \models \Phi$. □ 
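The encoding of truth values by clock reset times can be simulated directly. The sketch below does not build the TCTLc formula; it only mirrors the idea of the reduction, assuming that clock $x_i$ is reset at time $i$ or $i + 1$ and that $p_i$ is read as true exactly when $x_i = n + 1 - i$ at time $n + 1$. The function name and the input format are ours.

```python
def qbf_via_reset_times(quantifiers, matrix):
    """Evaluate Q1 p1 ... Qn pn . matrix by choosing clock reset times.

    quantifiers: string over "E" (exists) and "A" (forall).
    matrix: function from a tuple of n booleans to a bool.
    """
    n = len(quantifiers)

    def go(i, resets):
        if i > n:
            # Value of clock x_j at time n+1 is (n+1) - reset time of x_j.
            values = tuple((n + 1) - resets[j] == (n + 1) - (j + 1)
                           for j in range(n))
            return matrix(values)
        branches = (go(i + 1, resets + [r]) for r in (i, i + 1))
        return any(branches) if quantifiers[i - 1] == "E" else all(branches)

    return go(1, [])
```

For instance, `qbf_via_reset_times("AE", lambda v: v[0] == v[1])` evaluates the formula "for all p1 there is p2 with p1 equivalent to p2", while swapping the quantifiers to `"EA"` gives a formula that is not valid.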

In practice, one can easily use any CTL model checker for model checking TCTLc 
formulae, and the resulting algorithm runs in time 0{\ S \ . \ p\). For example, 

with SMV, one just adds one variable for each formula clock and update them in the 
obvious way. This is much more practical than an approach based on Theorem 3.1 and 
the complexity is not too frightening for formulae with 167(1= 1 (only one clock), a 
fragment already more expressive than TCTLs. 

^ This uses standard arguments, see the long version for details. 
A theoretical view. The following table gives a synthetic summary of complexity measures 
for model checking CTL, TCTLs and TCTLc, showing that model checking the 
full TCTLs is as tractable as model checking CTL in both arguments. On the other hand, 
model checking TCTLc requires polynomial space even for a fixed Kripke structure. 



                               CTL, TCTLs             TCTLc 
Complexity of model checking   P-complete             PSPACE-complete 
Formula complexity             LOGSPACE               PSPACE-complete 
Program complexity             NLOGSPACE-complete     NLOGSPACE-complete 



Filling the table. Model checking TCTLs is in P as we just saw. P-hardness results from 
the obvious reading of the circuit-value problem (with proper alternation) as a model 
checking problem for the EX fragment of CTL. The formula complexity of model 
checking CTL is LOGSPACE and this result can be easily extended to TCTLs. The 
program complexity of model checking TCTLs and TCTLc is NLOGSPACE-complete 
since we proved (Theorem 3.1) that these logics can be translated into CTL, for which 
the NLOGSPACE-complete complexity is given in [11]. 

Symbolic model checking. When it comes to symbolic model checking (i.e., when $S$ is 
given under the form of a synchronized product of $k$ structures $S_1, \dots, S_k$), CTL model 
checking becomes PSPACE-complete [11]. This is also true for TCTLs and TCTLc: 

Theorem 4.3. The symbolic model checking problem for TCTLs and TCTLc is 
PSPACE-complete. 

5 Conclusion 

We investigated the expressive power and the complexity of model checking for TCTLs 
and TCTLc, two quantitative extensions of CTL along the lines of RTCTL [8,10]. 

The expressive power must be measured in a framework where, strictly speaking, 
everything can be translated into CTL. 

We showed that TCTLs, while more succinct than RTCTL, still allows an efficient 
model checking algorithm. By contrast, TCTLc, the extension of CTL with freeze 
quantifiers, leads to a complexity blow-up. 
Abstract. The notion of contextual equivalence is fundamental in the theory of 
programming languages. By setting up a notion of bisimilarity, and showing that 
it coincides with contextual equivalence, one obtains a simple coinductive proof 
technique for showing that two programs are equivalent in all contexts. In this 
paper we apply these (now standard) techniques to interaction nets, a graphical 
programming language characterized by local reduction. This work generalizes 
previous studies of operational equivalence in interaction nets since it can be 
applied to untyped systems, so that all systems of interaction nets are captured. 



1 Introduction 

Interaction nets, introduced by Lafont [7], are graph rewriting systems that generalize the 
multiplicative proof nets of linear logic, and can be seen either as a high-level programming 
language or as a low-level implementation language. A program consists of a net (a graph 
built from a set of agents and wires) and a set of interaction rules that describe the way in 
which the net will be reduced. We are interested in the problem of defining an equivalence 
relation between programs that compute the same results, or in other words, that behave 
in the same way, in all contexts. In that case, one program can be replaced by the other, for 
example for efficiency reasons, without altering the operational semantics of the system. 
To define this equivalence relation we first need to develop an operational theory of 
interaction nets specifying in a precise way how programs are executed (i.e. a strategy 
of evaluation of nets and a notion of value). 

In [2] we proposed a way of adapting the coinductive techniques, used successfully 
for the functional and object-oriented programming paradigms, to give a notion of 
operational equivalence for the interaction paradigm. The language of interaction nets that 
was studied focussed on the notion of type, which is natural if interaction nets are seen 
as a programming paradigm. In particular, types allow us to distinguish values from 
programs. However, some applications of interaction nets do not fit into the typed framework 
in a natural way. For instance, systems based on the interaction combinators [8], or the 
systems of interaction used for the encoding of the λ-calculus [9], are untyped. Although 
it is possible to develop a type system for them [6], a natural approach would be to develop 
an operational theory of equivalence of interaction nets that does not rely on the notion 
of types. The same remark can be made in the case of functional languages based on the 
λ-calculus, where two different approaches can be found in the literature, depending on 
whether the calculus is typed (see for instance [10]) or untyped (see for instance [1]). 

In this paper we present an operational theory for untyped interaction nets, including 
a notion of contextual equivalence and an associated bisimilarity relation which permits 
the use of coinductive techniques in the proofs of operational equivalence. To express 
these notions we use the textual calculus of interaction nets presented in [3] instead of 
the graphical language, since it allows us to give a concise and formal presentation. We 
leave the use of diagrams for the examples and intuitive explanations. 

A system of interaction nets is a user-defined language, in the same spirit as systems 
based on term-rewriting. Our results are applicable to any system of interaction nets; 
we are not restricted to one specific set of rules. If the system is typed, the information 
provided by types can be used to obtain a more refined equivalence relation between 
nets, recovering the results of [2]. We remark that interaction nets are also used as an 
object language for the coding of other rewriting systems. The λ-calculus is perhaps the 
most studied example of this (see e.g. [4,9]). Our results are also applicable here, so we 
have a proof technique for optimizations of such systems. 

The paper is organized as follows. In the next section we set up the definition of 
interaction nets and define our evaluation strategy. Section 3 sets up the notion of 
bisimilarity. In Section 4 we give some examples of the use of this relation. In Section 5 we 
formalize the notion of contextual equivalence, and Section 6 shows that this coincides 
with bisimilarity. Finally we conclude the paper in Section 7. 

2 Background: Interaction Nets 

We begin by presenting the textual calculus of interaction nets that we will use for the 
rest of the paper; we refer the reader to [3] for a more detailed description and examples. 

Let $\Sigma$ be a set of symbols, called agents, ranged over by $\alpha, \beta, \dots$, each with a given 
arity, one principal port and a number of auxiliary ports equal to its arity. $\Sigma$ can be 
partitioned into a set $\mathcal{C}$ of constructors and a set $\mathcal{D}$ of destructors, depending on the 
application. Let $\mathcal{N}$ be a disjoint set of names, ranged over by $x, y, z$, etc. Terms are 
defined by the grammar: $t ::= x \mid \alpha(t_1, \dots, t_n)$, where $x \in \mathcal{N}$, $\alpha \in \Sigma$, $arity(\alpha) = n$ 
and $t_1, \dots, t_n$ are terms, with the restriction that each name may appear at most twice. 
$\mathcal{N}(t)$ denotes the set of names occurring in $t$. If a name occurs twice in a term, we say 
that it is bound, otherwise it is free. We write $\vec{t}$ for a list of terms $t_1, \dots, t_n$. Graphically, 
a term of the form $\alpha(\vec{t})$ can be seen as a tree with connections between its leaves: the 
principal port of $\alpha$ (indicated by an arrow) is at the root, and the terms $t_1, \dots, t_n$ are the 
subtrees connected to the auxiliary ports of $\alpha$. A free variable represents a free port, and 
a bound variable represents a wire connecting two auxiliary ports. 
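The term grammar and the free/bound distinction translate directly into a small data type. The following is a minimal sketch; the class and function names (`Name`, `Agent`, `classify`) are hypothetical and the arity check against a signature is left out.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Name:
    ident: str


@dataclass(frozen=True)
class Agent:
    symbol: str
    args: Tuple["Term", ...] = ()


Term = Union[Name, Agent]


def occurrences(t, counts=None):
    """Count how often each name occurs in the term t."""
    counts = Counter() if counts is None else counts
    if isinstance(t, Name):
        counts[t.ident] += 1
    else:
        for arg in t.args:
            occurrences(arg, counts)
    return counts


def classify(t):
    """Split the names of t into (free, bound); reject names used more than twice."""
    counts = occurrences(t)
    if any(c > 2 for c in counts.values()):
        raise ValueError("a name may appear at most twice")
    free = {x for x, c in counts.items() if c == 1}
    bound = {x for x, c in counts.items() if c == 2}
    return free, bound
```

For the term `Agent("a", (Name("x"), Agent("b", (Name("x"), Name("y")))))`, the name `x` is bound (it stands for a wire between two auxiliary ports) and `y` is free.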




If $t$ and $u$ are terms, then the (unordered) pair $t = u$ is an equation. $\Delta, \Theta, \dots$ will be 
used to range over multisets of equations. The graphical representation of an equation 
is a pair of trees connected by their roots (principal ports). 
Interaction rules are pairs of terms written as $\alpha(\vec{t}) \bowtie \beta(\vec{u})$, where $(\alpha, \beta) \in \Sigma \times \Sigma$ is 
the active pair of the rule. All names occur exactly twice in a rule, and there is one rule 
for each pair of agents. 

Definition 1 (Configurations). A configuration is a pair: $c = (\mathcal{R}, \langle \vec{t} \mid \Delta \rangle)$, where $\mathcal{R}$ is 
a set of rules, $\vec{t}$ a list $t_1, \dots, t_n$ of terms, and $\Delta$ a multiset of equations. Each variable 
occurs at most twice in $c$. If a name occurs once in $c$ then it is free, otherwise it is bound. 
For simplicity we sometimes omit $\mathcal{R}$ when there is no ambiguity. We use $c, c'$ to range 
over configurations. We call $\vec{t}$ the head and $\Delta$ the body of a configuration. 

Intuitively, $(\mathcal{R}, \langle \vec{t} \mid \Delta \rangle)$ represents a net that we evaluate using $\mathcal{R}$. To draw the net we 
simply draw the trees for the terms in $\langle \vec{t} \mid \Delta \rangle$, connect the common variables together, 
and connect the roots of the trees corresponding to the members of an equation together. 
The roots of the terms in the head of the configuration and the free names correspond to 
free ports in the interface of the net. Note that the head of the configuration may contain 
all or just some of the ports in the interface of the net; those in the head are called observable. For this reason, 
the head is called the observable interface of the configuration. 

We work modulo α-equivalence for bound names as usual, but also for free names. 
Configurations that differ only in the names of the free variables are equivalent, since 
they represent the same net. 

Computation is performed by rewriting configurations using the following rewrite 
system, where if $r$ is a rule, $\hat{r}$ denotes a fresh generic instance of $r$, that is, a copy of $r$ 
where we introduce a new set of names: 

Indirection: If $x \in \mathcal{N}(u)$, then $x = t,\ u = v \longrightarrow u[t/x] = v$. 

Interaction: If $r \in \mathcal{R}$ and $\hat{r} = \alpha(t'_1, \dots, t'_n) \bowtie \beta(u'_1, \dots, u'_m)$, then 
$\alpha(t_1, \dots, t_n) = \beta(u_1, \dots, u_m) \longrightarrow t_1 = t'_1, \dots, t_n = t'_n,\ u_1 = u'_1, \dots, u_m = u'_m$. 

Context: If $\Delta \longrightarrow \Delta'$, then $\langle \vec{t} \mid \Gamma, \Delta, \Gamma' \rangle \longrightarrow \langle \vec{t} \mid \Gamma, \Delta', \Gamma' \rangle$. 

Collect: If $x \in \mathcal{N}(\vec{t})$, then $\langle \vec{t} \mid x = u, \Delta \rangle \longrightarrow \langle \vec{t}[u/x] \mid \Delta \rangle$. 

This rewrite system generates an equational theory; the corresponding equivalence 
relation is denoted by $c \leftrightarrow^* c'$. The reduction relation $\longrightarrow$ is strongly confluent [3] since 
there is one rule for each pair of agents. Various strategies of evaluation are defined 
in [3]. The values that we use in this paper, called interface normal forms, have terms 
rooted by agents in the head whenever this is possible. 
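The Indirection and Collect rules are plain substitutions and can be sketched in a few lines. The encoding below is an assumption made for illustration only: names are Python strings, an agent applied to arguments is a tuple `(symbol, arg1, ..., argn)`, and an equation is a pair. The Interaction rule, which needs a rule table and fresh names, is not covered.

```python
def names(t):
    """Names occurring in a term; names are str, agents are tuples (symbol, *args)."""
    if isinstance(t, str):
        return {t}
    return set().union(*(names(a) for a in t[1:]))


def subst(t, x, u):
    """Replace the name x by the term u in t."""
    if isinstance(t, str):
        return u if t == x else t
    return (t[0],) + tuple(subst(a, x, u) for a in t[1:])


def indirection(eqs):
    """Apply  x = t, u = v  ->  u[t/x] = v  once; return None if not applicable."""
    for i, (left, right) in enumerate(eqs):
        for x, t in ((left, right), (right, left)):  # equations are unordered
            if not isinstance(x, str):
                continue
            for j, (u, v) in enumerate(eqs):
                if j != i and (x in names(u) or x in names(v)):
                    new = (subst(u, x, t), subst(v, x, t))
                    rest = [e for m, e in enumerate(eqs) if m not in (i, j)]
                    return rest + [new]
    return None


def collect(head, eqs):
    """Apply  <t | x = u, D>  ->  <t[u/x] | D>  once; return None if not applicable."""
    for i, (left, right) in enumerate(eqs):
        for x, u in ((left, right), (right, left)):
            if isinstance(x, str) and any(x in names(t) for t in head):
                return [subst(t, x, u) for t in head], eqs[:i] + eqs[i + 1:]
    return None
```

Starting from the head `["x"]` and the equations `[("x", ("S", "y")), ("y", ("Z",))]`, two applications of `collect` bring the whole term `("S", ("Z",))` into the head and leave no equations.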

Definition 2 (Interface Normal Form). A configuration $(\mathcal{R}, \langle \vec{t} \mid \Delta \rangle)$ is in interface 
normal form (INF) if each $t_i$ in $\vec{t}$ is of one of the following canonical forms: 

- $\alpha(\vec{s})$. E.g. $\langle S(x) \mid x = Z, \Delta \rangle$. 

- $x$ where either $x \in \mathcal{N}(t_j)$ for some $j \neq i$, or $x \in \mathcal{N}(u)$ for some $y = u \in \Delta$ such 
that $y \in \mathcal{N}$ is free ($x$ is in an open path). E.g. $\langle x, x \mid \Delta \rangle$. 

- $x$ where $x \in \mathcal{N}(u)$ for some $y = u \in \Delta$ such that $y \in \mathcal{N}(u)$ ($x$ occurs in a cycle). 
E.g. $\langle x \mid y = \alpha(\beta(y), x), \Delta \rangle$. 

We denote by $\mathrm{INF}_i$ the set of configurations where the $i$th port in the head is canonical. 
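Computing interface normal forms suggests that we do the minimum work required 
to bring principal ports to the interface. This strategy is defined by the inference rules: 

Axiom: 

$$\dfrac{c \in \mathrm{INF}}{c \Downarrow c}$$

Collect: 

$$\dfrac{\langle t_1, \dots, t, \dots, t_n \mid \Delta \rangle \Downarrow c}{\langle t_1, \dots, x, \dots, t_n \mid x = t, \Delta \rangle \Downarrow c}$$

Indirection: if $x \in \mathcal{N}(u)$ and $y \in \mathcal{N}(t, u = v)$ 

$$\dfrac{\langle t_1, \dots, y, \dots, t_n \mid u[t/x] = v, \Delta \rangle \Downarrow c}{\langle t_1, \dots, y, \dots, t_n \mid x = t, u = v, \Delta \rangle \Downarrow c}$$

Interaction: if $x \in \mathcal{N}(\alpha(\vec{t}) = \beta(\vec{u}))$, $r \in \mathcal{R}$, $\hat{r} = \alpha(\vec{t'}) \bowtie \beta(\vec{u'})$ 

$$\dfrac{\langle s_1, \dots, x, \dots, s_n \mid \vec{t} = \vec{t'}, \vec{u} = \vec{u'}, \Delta \rangle \Downarrow c}{\langle s_1, \dots, x, \dots, s_n \mid \alpha(\vec{t}) = \beta(\vec{u}), \Delta \rangle \Downarrow c}$$

This system is deterministic [3]. If $c \Downarrow v$ can be derived with these rules we say that $v$ 
is the interface normal form of $c$. We write $c \Downarrow_i v$ (i.e. the position $i$ in the head of $v$ 
is canonical) if the rules Indirection and Interaction are only applied at position $i$ in the 
head of the configuration and the axiom is replaced by 

$$\dfrac{c \in \mathrm{INF}_i}{c \Downarrow_i c}$$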


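Example 1 (Combinators). The interaction combinators [8] are a universal system of 
interaction built from the 0-ary agent $\varepsilon$ and the binary agents $\delta$ and $\gamma$, with the rules: 

$$\delta(x, y) \bowtie \delta(x, y) \qquad \delta(\varepsilon, \varepsilon) \bowtie \varepsilon$$

$$\gamma(x, y) \bowtie \gamma(y, x) \qquad \gamma(\varepsilon, \varepsilon) \bowtie \varepsilon$$

$$\varepsilon \bowtie \varepsilon \qquad \delta(\gamma(a, b), \gamma(c, d)) \bowtie \gamma(\delta(a, c), \delta(b, d))$$

The configuration $\langle x \mid \gamma(y, x) = \delta(\varepsilon, y) \rangle$ gives a non-terminating sequence of 
reductions, but has an interface normal form: $\langle \delta(a, b) \mid \varepsilon = \gamma(c, a),\ \delta(c, d) = \gamma(d, b) \rangle$. 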



3 Bisimilarity 

In functional languages, we consider two functions equivalent when we can apply them 
to the same arguments and obtain the same results. In other words, we perform some 
form of experiment on the objects under test, and compare the results. For interaction 
nets, we take this general idea as inspiration. The way that we can make experiments 
with a net is to interact with it on a free principal port. Connecting nets on free principal 
ports is our analogue of applying a function to an argument. After evaluation, we can 
observe whether some principal ports are at the interface, which is analogous to observing 
whether a λ-term has evaluated to an abstraction. Only the observable ports (in the head 
of the configuration) are available for the experiments. Configurations with the same 
number of terms in the head are said to be comparable. 

Two configurations that cannot be distinguished by any experiment will be called 
bisimilar. We will show that the bisimilarity relation can be defined as the greatest 
post-fixpoint of an operator (which allows us to use coinductive techniques to prove that 
two configurations are bisimilar) and, more importantly, that it coincides with 
contextual equivalence; that is, two bisimilar configurations cannot be distinguished by any context 
and can therefore be exchanged without altering the behaviour of the system. We begin 
with some basic definitions to formalise these ideas. 

Definition 3 (Visible Interface). A configuration $c = \langle t_1, \dots, t_n \mid \Delta \rangle \in \mathrm{INF}_i$ has a 
visible interface at position $i$ if either $t_i$ is not a variable or there is an open path starting at 
$t_i$ and finishing at some $t_j = \alpha'(\vec{u}) \in \vec{t}$, that is, $t_i = \alpha(\vec{u})$, or $t_i = x$ and there is an open 
path to $t_j = \alpha'(\vec{u})$. The visible agent at position $i$ is $\alpha$ in the first case, $\alpha'$ in the second 
case. The rest of the net is called the kernel: $\mathcal{K}_i(c) = \langle t_1, \dots, t_{k-1}, \vec{u}, t_{k+1}, \dots, t_n \mid \Delta \rangle$ 
where $k = i$ or $k = j$ depending on whether we are in the first or second case. The set 
of new observable positions in $\mathcal{K}_i(c)$, denoted $NP_{\mathcal{K}}(c, i)$, is the set of positions of the 
terms $\vec{u}$ if $k = i$, otherwise it contains just the new position of $t_i$. We denote by $V_i$ the 
set of all the configurations with a visible interface at position $i$. 

If $v$ and $v'$ are comparable and have the same visible agent at position $i$, we write 
$SVA_i(v, v')$. If the visible agents are different, but they are not both constructors, we 
write $\lnot Constr_i(v, v')$. 

Example 2. Let $c = \langle I(x), x \mid\ \rangle$. Since $t_1 = I(x)$ is not a variable, $c \in V_1$. Since $t_2 = x$ 
and there is an open path to $t_1 = I(x)$, $c \in V_2$. The visible agent is $I$ for both positions, 
and $\mathcal{K}_1(c) = \mathcal{K}_2(c) = \langle x, x \mid\ \rangle$. 

When we have different agents in the visible interfaces of the nets under test, and 
they are not constructors, we need to see if these agents behave in the same way for each 
possible agent interacting with them. For this we use closings. 

Definition 4 (Closing). A closing at position $i$ of a configuration $c = \langle \vec{t} \mid \Delta \rangle \in V_i$, 
denoted by $cl_i(c)$, is obtained from $c$ by one of the following operations, where $k = i$, 
or $k = j$ if there is an open path starting at position $i$ and finishing at position $j$ in $c$: 

1. replace $t_k = \alpha(\vec{s})$ in $\vec{t}$ by a list of new variables $z_1, \dots, z_p \in \mathcal{N}(\vec{u})$, $p \geq 0$, and 
add to $\Delta$ the equation $t_k = \alpha'(\vec{u})$, where $\alpha'$ is any agent, and the terms in $\vec{u}$ are 
either new variables (in which case they can appear twice in $\alpha'(\vec{u})$ or once in $\alpha'(\vec{u})$ 
and once in $\vec{z}$) or elements of $\vec{t}$, in which case they are erased from $\vec{t}$. 

The set $NP_{cl}(c, i)$ of new observable positions in $cl_i(c)$ contains the positions of 
the variables $z_1, \dots, z_p$ in the new head if $i = k$, otherwise it contains just the new 
position of $t_i$. 

2. erase $t_k = \alpha(\vec{s})$ and another term $t_p$ in $\vec{t}$ and add $t_k = t_p$ to $\Delta$. In this case 
$NP_{cl}(c, i) = \emptyset$ if $i = k$, otherwise it contains just the new position of $t_i$. 

By abuse of notation, we will denote by $cl_i(c')$ the result of applying to a configuration 
$c'$ comparable with $c$ the operations that define a closing at position $i$ for $c$. 
Graphically, the first operation corresponds to connecting the principal port of an 
agent $\alpha'$ to the $k$th observable port in the interface of the net, and connecting some 
auxiliary ports of $\alpha'$ between them (if a variable appears twice in it), or to other observable 
ports in the net (if $\vec{u}$ contains terms in $\vec{t}$). The second operation corresponds to simply 
adding a wire connecting the observable ports $k$ and $p$. 

We consider a complete lattice $(Rel, \subseteq)$ where $Rel$ is the set of binary relations 
between pairs $(c, i)$, $(c', i)$ such that $c, c'$ are comparable configurations whose heads 
have at least $i$ elements (i.e. we can talk of the $i$th observable port). The operators $\langle \mathcal{R} \rangle$ 
and $[\mathcal{R}]$ for $\mathcal{R} \in Rel$ will be used to define similarity and bisimilarity respectively. 

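Definition 5 (Operators). Let $c, c'$ be comparable configurations with at least $i$ terms 
in the head. 

$$(c, i)\ \langle \mathcal{R} \rangle\ (c', i) \iff c \Downarrow_i v \in V_i \Rightarrow \exists v',\ \bigl(c' \Downarrow_i v' \text{ and}$$
$$\text{either } SVA_i(v, v') \text{ and } \forall p \in NP_{\mathcal{K}}(v, i),\ (\mathcal{K}_i(v), p)\ \mathcal{R}\ (\mathcal{K}_i(v'), p)$$
$$\text{or } \lnot Constr_i(v, v'),\ \forall cl_i(v),\ \forall p \in NP_{cl}(v, i),\ (cl_i(v), p)\ \mathcal{R}\ (cl_i(v'), p)\bigr)$$

$$(c, i)\ [\mathcal{R}]\ (c', i) \iff (c, i)\ \langle \mathcal{R} \rangle\ (c', i) \text{ and } (c', i)\ \langle \mathcal{R} \rangle\ (c, i)$$

Property 1. $\langle \cdot \rangle$ and $[\cdot]$ are monotone operators. 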

Definition 6 (Similarity, Bisimilarity). 

- A relation $S \in Rel$ such that $S \subseteq \langle S \rangle$ (i.e. $S$ is a post-fixpoint of $\langle \cdot \rangle$) is a simulation. 
The greatest such $S$ is called similarity, and written as $\precsim$. If $c, c'$ are comparable 
configurations with $n$ elements in the head, then $c \precsim c'$ if $(c, i) \precsim (c', i)$, $1 \leq i \leq n$. 

- A relation $B \in Rel$ such that $B \subseteq [B]$ (i.e. $B$ is a post-fixpoint of $[\cdot]$) is a bisimulation. 
The greatest such $B$ is called bisimilarity, and written as $\sim$. If $c, c'$ are comparable 
configurations with $n$ elements in the head, then $c \sim c'$ if $(c, i) \sim (c', i)$, $1 \leq i \leq n$. 

Note that $\langle \cdot \rangle$ and $[\cdot]$ possess a greatest post-fixpoint by the Tarski-Knaster Fixed Point 
Theorem. Moreover, $\precsim$ and $\sim$ are fixed points, i.e. $\precsim\ =\ \langle \precsim \rangle$ and $\sim\ =\ [\sim]$. 
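The greatest post-fixpoint construction can be illustrated on a finite example. The sketch below is an analogue, not the operator $[\cdot]$ of Definition 5: it computes ordinary bisimilarity on a small labelled transition system by iterating a monotone operator downward from the full relation, which on a finite lattice reaches the greatest fixed point. All names and the example system are hypothetical.

```python
def greatest_fixpoint(op, top):
    """Iterate a monotone operator from the top element until it stabilises."""
    current = top
    while True:
        nxt = op(current)
        if nxt == current:
            return current
        current = nxt


def bisimulation_operator(states, trans):
    """Build the operator whose post-fixpoints are the bisimulations of an LTS.

    trans maps each state to a list of (label, successor) pairs.
    """
    def half(rel, p, q):
        # Every move of p is matched by an equally labelled move of q.
        return all(any(a == b and (p2, q2) in rel for b, q2 in trans[q])
                   for a, p2 in trans[p])

    def op(rel):
        flipped = frozenset((q, p) for p, q in rel)
        return frozenset((p, q) for p in states for q in states
                         if half(rel, p, q) and half(flipped, q, p))

    return op


def bisimilarity(states, trans):
    top = frozenset((p, q) for p in states for q in states)
    return greatest_fixpoint(bisimulation_operator(states, trans), top)
```

Any relation contained in its own image under `op` is a bisimulation, so exhibiting one such relation containing a pair suffices to prove the pair bisimilar; this is the coinductive proof style used in the rest of the paper.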

Remark 1. The main difference with the typed approach resides in the definition of 
closings and the way they are used in the definition of the operators (•) and [•]. Here 
closings are applied “on demand” whereas they are a static notion in the typed framework. 
More precisely, in a typed net a closing is built just by connecting agents to all the free 
input ports. The Subject Reduction property ensures that reduction will not create new 
free input ports. Instead, here we close one principal port at a time, and since reduction 
might create a new free principal port, closings are applied in a dynamic way. 

The relations $\precsim$, $\sim$ can be defined by levels, as done by Abramsky for the untyped 
λ-calculus [1]. 

Proposition 1 (Coinduction Principle). Let $c, c'$ be comparable configurations with $n$ 
observable ports. To prove $c \sim c'$ it suffices to find a bisimulation $B$ such that 
$(c, i)\ B\ (c', i)$ for $1 \leq i \leq n$. 

By coinduction we can show that the equational theory is included in the bisimilarity 
relation. In Section 4 we give more examples of application of coinduction to prove 
bisimilarity; in particular we will show that this inclusion is strict. 

Theorem 1 (Bisimilarity includes the equational theory). $c \leftrightarrow^* c' \Rightarrow c \sim c'$. 
4 Examples 

The Identity agent and a wire. Let $I$ be the identity agent defined by rules 

$$I(\alpha(x_1, \dots, x_n)) \bowtie \alpha(I(x_1), \dots, I(x_n))$$

for any $\alpha \in \Sigma$. We can prove $\langle I(x), x \mid\ \rangle \sim \langle x, x \mid\ \rangle$ by coinduction. Take a symmetric 
$R$ containing the pairs $((c, i), (c', i))$ such that $c'$ is obtained from $c$ by erasing the $I$ 
agents at the root of a term in the head, or at the root of a member of an equation. We 
show that $R$ is a bisimulation: if $c \Downarrow_i v \in V_i$, then $c' \Downarrow_i v'$, and either they have the 
same visible agent $\alpha$ at position $i$, in which case the kernels are in the relation for all the 
new observable positions, or if they differ, then one is rooted by $I$ and the other is just a 
variable. In that case the closings are in the relation, which is sufficient since $I$ is not a 
constructor. 

Copying before erasing or just erasing. In the system of the interaction combinators, 
replacing a net of the form 




by the agent $\varepsilon$ seems an intuitive optimization. We can prove that they are bisimilar by 
coinduction. The Main Theorem 2 then tells us that these configurations are contextually 
equivalent, so the optimization is correct. 

Agents $\gamma$ and $\delta$. The following nets are bisimilar: 





To show it using the coinduction principle we consider a relation containing $\sim$ and 
these pairs, for any closing of the free principal port. The interesting closings are built 
by adding an agent $\varepsilon$, $\gamma$, or $\delta$ (the closings using just wires do not reduce to a value 
with a visible interface). The case of $\varepsilon$ is trivial. For the other cases, by reducing to 
interface normal form we obtain configurations that have the same visible agents and 
whose kernels are easily shown to be bisimilar, hence contained in our relation. 

η-rules for $\gamma$ and $\delta$. The following nets are bisimilar: 

[Figure: two pairs of nets, one built with $\gamma$ and one with $\delta$.] 







M. Fernandez, I. Mackie 



Note that these last two equivalences are neither included in the equational the- 
ory (since the nets are different normal forms) nor provable using the path semantics 
developed by Lafont [8]. 

5 Contextual Equivalence 

We define a set of operations that build a context for a configuration, in the same way 
that closings were defined by operations. But there are more operations in the case of 
contexts, and we can have a sequence of operations instead of just one operation. 

Definition 7 (Context). A context at position $i$ for a configuration $c = \langle \vec{t} \mid \Delta \rangle$ is defined 
by a (possibly empty) sequence of operations, where $k = i$, or $k = j$ if there is an open 
path starting at position $i$ and finishing at position $j$ in $c$. Non-empty sequences (i.e. 
contexts) are defined inductively; there are three cases according to the first operation 
used. 

1. Addition of agent by principal port: This operation replaces $t_k$ in $\vec{t}$ by a list of new 
variables $z_1, \dots, z_p \in \mathcal{N}(\vec{u})$, and adds to $\Delta$ the equation $t_k = \alpha(\vec{u})$, where $\alpha$ is 
any agent, and the terms in $\vec{u}$ are either new variables (in which case they can occur 
twice in $\alpha(\vec{u})$ or once in $\alpha(\vec{u})$ and once in $\vec{z}$) or elements of $\vec{t}$, in which case they 
are erased from $\vec{t}$. 

In this case the rest of the sequence is the concatenation of contexts at the positions 
of the variables $z_1, \dots, z_p$ in the new head and at the new position of $t_i$ if $i \neq k$. 

2. Addition of agent by auxiliary port: This operation replaces $t_k$ in $\vec{t}$ by a list of new 
variables $z_1, \dots, z_p$ occurring free in $y = \alpha(\vec{u})$ and adds this equation to $\Delta$, where 
$\alpha$ is any agent, and the terms in $\vec{u}$ are either new variables (in which case they can 
occur twice in $\alpha(\vec{u})$ or once in $\alpha(\vec{u})$ and once in $\vec{z}$) or elements of $\vec{t}$, in which case 
they are erased from $\vec{t}$. The term $t_k$ must occur in $\vec{u}$. 

Also in this case the rest of the sequence is composed of contexts at the positions of 
the variables $z_1, \dots, z_p$ in the new head and at the new position of $t_i$ if $i \neq k$. 

3. Addition of a wire: erase $t_k$ and another term $t_p$ in $\vec{t}$ and add $t_k = t_p$ to $\Delta$. In this 
case the rest of the sequence is empty if $i = k$, otherwise it is a context at the new 
position of $t_i$. 

We denote by $op_{i,\vec{j}}(c)$ the result of applying an operation as above to the configuration $c$ 
at position $i$, using the positions $\vec{j}$ in $\vec{t}$. We denote by $C_i[c]$ the configuration resulting from 
applying the context $C$, defined by a sequence of operations as above, to the configuration 
$c$ at position $i$, and by $C(c, i)$ a generic context for $c$ at position $i$. We will also denote 
by $C_i[c']$ the result of applying to a configuration $c'$ comparable with $c$ the operations 
that define a context at position $i$ for $c$. 

The set $NP_C(c, i)$ of new observable positions of $C_i[c]$ is computed as follows: we 
start with the set $\{i\}$, and compute a new set each time we perform an operation. The 
first and second operations add the positions of the variables $z_1, \dots, z_p$ in the new head, 
and if $i = k$ they erase $i$, otherwise they replace $i$ by the new position of $t_i$ in the head. 
The third operation simply erases the position $i$ from the set if $i = k$, otherwise replaces 
$i$ by the new position of $t_i$. 
Graphically, the first two operations correspond to connecting an agent $\alpha$ to an 
observable port of the net (using the principal port of $\alpha$ in the first one, and an auxiliary 
port in the second one). The third operation corresponds to adding a wire connecting 
the observable ports $k$ and $p$. Closings are particular cases of contexts defined by one 
operation of the first or third class. 

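Definition 8 (Contextual Preorder and Contextual Equivalence). Let $c, c'$ be 
comparable configurations with $n$ elements in the head. 

$$c \leq c' \iff \forall i \in [1 \dots n],\ (c, i) \leq (c', i)$$

$$(c, i) \leq (c', i) \iff \forall C(c, i),\ \forall p \in NP_C(c, i),\ C_i[c] \Downarrow_p v \in V_p \Rightarrow \exists v',\ \bigl(C_i[c'] \Downarrow_p v' \text{ and either } SVA_p(v, v') \text{ or } \lnot Constr_p(v, v')\bigr)$$

$$(c, i) \cong (c', i) \iff (c, i) \leq (c', i) \text{ and } (c', i) \leq (c, i)$$

$$c \cong c' \iff \forall i \in [1 \dots n],\ (c, i) \cong (c', i)$$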

6 Main Result 

We will show that the notions of contextual equivalence and bisimilarity coincide, if the 
interaction net system has “enough contexts” to extract the kernels of all values. 

Definition 9. A system of interaction is complete if for any $v \in V_i$ with visible agent $\alpha$ at 
position $i$, there exists a context $C^{\alpha}$ such that $\forall p \in NP_{\mathcal{K}}(v, i)$, $(\mathcal{K}_i(v), p) \sim (C^{\alpha}_i[v], p)$ 
and $p \in NP_{C^{\alpha}}(v, i)$. 

Theorem 2. If the interaction net system is complete, $\leq$ (resp. $\cong$) coincides with 
$\precsim$ (resp. $\sim$). Otherwise $\precsim$ (resp. $\sim$) is included in $\leq$ (resp. $\cong$). 

Proof. To prove $\precsim\ \subseteq\ \leq$ it is sufficient to show that $\precsim$ is preserved by context. Following 
Howe [5] we prove that $\precsim$ is a precongruence (a preorder preserved by context) using 
an auxiliary relation $\precsim^*$, the precongruence candidate, defined as follows. 

Let $c, c'$ be comparable configurations with $n$ elements in their heads. 

$$c \precsim^* c' \iff \forall i \in [1 \dots n],\ (c, i) \precsim^* (c', i)$$

$$(c, i) \precsim^* (c', i) \iff \text{either } (c, i) \precsim (c', i), \text{ or } c = op_{p,\vec{j}}(d),\ i \in NP_{op}(d, p),\ (d, q) \precsim^* (d', q)\ \forall q \in \vec{j}, \text{ and } (op_{p,\vec{j}}(d'), i) \precsim (c', i).$$

The precongruence candidate enjoys the following properties. 

Property 2. 1. $\precsim\ \subseteq\ \precsim^*$. 

2. $\precsim^*$ is reflexive. 

3. $c \precsim^* c',\ c' \precsim c'' \Rightarrow c \precsim^* c''$. 

4. $\precsim^*$ is preserved by context: $c \precsim^* c' \Rightarrow \forall i,\ \forall C(c, i),\ C_i[c] \precsim^* C_i[c']$. 

To show that $\precsim$ is a precongruence it is sufficient to prove that it coincides with $\precsim^*$, 
for which it remains to prove $\precsim^*\ \subseteq\ \precsim$. This follows, by coinduction, from: 

Proposition 2. 1. $v \in V_i,\ (v, i) \precsim^* (v', i) \Rightarrow (v, i)\ \langle \precsim^* \rangle\ (v', i)$. 

2. $c \precsim^* c',\ c \Downarrow_i v \in V_i \Rightarrow v \precsim^* c'$. 

This concludes the proof of the first inclusion: $\precsim\ \subseteq\ \leq$. Now we prove $\leq\ \subseteq\ \precsim$ by 
coinduction, showing $\leq\ \subseteq\ \langle \leq \rangle$. Assume $(c, i) \leq (c', i)$. By definition of $\leq$, using an 
empty context, $c \Downarrow_i v \in V_i \Rightarrow \exists v',\ c' \Downarrow_i v'$ and either $SVA_i(v, v')$ or $\lnot Constr_i(v, v')$. 
In the latter case we are done, since closings are particular cases of contexts. In the 
first case, we know by completeness that $(\mathcal{K}_i(v), p) \sim (C^{\alpha}_i[v], p)$, $\forall p \in NP_{\mathcal{K}}(v, i)$. 
Moreover, since bisimilarity includes the equational theory (Theorem 1), and $(c, i) \leq (c', i)$: 
$(C^{\alpha}_i[v], p) \sim (C^{\alpha}_i[c], p) \leq (C^{\alpha}_i[c'], p) \sim (C^{\alpha}_i[v'], p)$. Again by completeness 
(since $SVA_i(v, v')$), $(C^{\alpha}_i[v'], p) \sim (\mathcal{K}_i(v'), p)$. Since we have already proved $\sim\ \subseteq\ \cong$, we 
get $(\mathcal{K}_i(v), p) \leq (\mathcal{K}_i(v'), p)$, $\forall p \in NP_{\mathcal{K}}(v, i)$ as required. □ 

7 Conclusion 

In this paper we have presented a notion of bisimilarity for (untyped) interaction nets.
This notion has been shown to coincide with contextual equivalence, which gives
a simple proof technique for showing that two nets are equivalent in all contexts.
One of the main applications that we see for this work is general correctness proofs
for optimizations in interaction net implementations of various systems, such as the
λ-calculus or term rewriting systems.
Abstract. For words of length n, generated by independent geometric
random variables, we consider the mean and variance, and thereafter 
the distribution of the number of runs of equal letters in the words. In 
addition, we consider the mean length of a run as well as the length of 
the longest run over all words of length n. 



1 Introduction 

Let $X$ denote a geometrically distributed random variable, i.e. $P\{X = k\} = pq^{k-1}$
for $k \in \mathbb{N}$ and $q = 1-p$. The combinatorics of $n$ geometrically distributed
independent random variables $X_1, \dots, X_n$ has attracted recent interest,
especially because of applications in computer science. We mention just two areas,
the skip list [1,13,15,8] and probabilistic counting [3,6,7,9].

In [14] the number of left-to-right maxima was investigated for words
$a_1 \dots a_n$, where the letters $a_i$ are independently generated according to the
geometric distribution. In [10] the study of left-to-right maxima was continued, but
now the parameters studied were the mean value and mean position of the $r$-th
maximum.

In this article we study runs of consecutive equal letters in a string of
$n$ geometrically distributed independent random letters. For example in
$w = 22211114431$ we have 5 runs of equal letters of respective lengths 3, 4, 2, 1, 1. In
the sequel we denote by $R_n(w)$ the number of runs in the word $w$, where $w$
is of length $n$. Run statistics play a significant role in the behaviour of sorting
algorithms, as explained at length in [12].
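The run decomposition of the example word can be checked mechanically. The following is a minimal sketch; the function name `run_lengths` is ours and not part of the paper.

```python
from itertools import groupby


def run_lengths(word):
    """Return the lengths of the maximal runs of equal letters in `word`."""
    return [len(list(group)) for _, group in groupby(word)]


w = "22211114431"
lengths = run_lengths(w)
number_of_runs = len(lengths)  # this is R_n(w) for n = len(w)
print(lengths, number_of_runs)
```

`groupby` collects consecutive equal letters, so each group is exactly one run in the sense used here.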

In section 2 we study the mean and variance of $R_n(w)$. Thereafter, in section 3
we study the distribution of the number of runs, which turns out to be Gaussian. 
Subsequently, in section 4 we study the average length of the runs per word. 
Finally, in section 5 we determine the mean and variance of the length of the 
longest run in a word of length n. 



2 Moments of Number of Runs 



In order to determine the mean and variance of the number of runs we will make
use of the following decomposition of the set of all (non-empty) words. Here
$\{\ge k\}$ denotes the set $\{k, k+1, \dots\}$; for a given set $A$ we denote

$$A^+ = \bigcup_{i=1}^{\infty} A^i, \qquad A^* = \varepsilon \cup A^+,$$

where $\varepsilon$ stands for the empty word. We decompose the set of non-empty words
according to runs of 1's, separated by words consisting of larger digits only:

$$\{\ge 1\}^+ = (\varepsilon + 1^+)\,(\{\ge 2\}^+ 1^+)^*\,\{\ge 2\}^+\,(\varepsilon + 1^+) + 1^+; \qquad (1)$$

here we find it more convenient to write $+$ instead of $\cup$.

We consider a probability generating function $F(z,u)$, where $z$ labels the
length of the word, and $u$ counts the number of runs. We should always have
$F(z,1) = \frac{z}{1-z}$, and increasing all letters by 1 corresponds to replacing $z$ by $qz$.
Then (1) translates into the functional equation

$$F(z,u) = \frac{F(qz,u)}{1 - F(qz,u)\,\frac{pzu}{1-pz}} \left(1 + \frac{pzu}{1-pz}\right)^2 + \frac{pzu}{1-pz}. \qquad (2)$$



Now we differentiate it w.r.t. $u$, plug in $u = 1$, set $G(z) = \frac{\partial}{\partial u}F(z,u)\big|_{u=1}$, and
get

$$G(z) = G(qz)\,\frac{(1-qz)^2}{(1-z)^2} + \frac{pz(1-pz)}{(1-z)^2}.$$

Setting $H(z) = (1-z)^2 G(z)$ yields

$$H(z) = H(qz) + pz(1-pz).$$

Comparing coefficients, we see that

$$[z]H(z) = 1, \qquad [z^2]H(z) = -\frac{p^2}{1-q^2} = -\frac{p}{1+q},$$

and that the other coefficients are zero. Consequently,

$$H(z) = z - \frac{p}{1+q}\,z^2$$

and

$$G(z) = \frac{z - \frac{p}{1+q}\,z^2}{(1-z)^2}.$$

This leads to 

Proposition 1. The mean value of the number of runs for $n \ge 1$ is given by

$$\mathbb{E} R_n = [z^n]G(z) = \frac{2q}{1+q}\,n + \frac{p}{1+q}.$$
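As a sanity check on the mean, one can compare the closed form with an elementary argument that is not the paper's generating function route: the number of runs is 1 plus the number of adjacent positions holding different letters, and two independent geometric letters are equal with probability $p^2/(1-q^2) = p/(1+q)$. The sketch below uses exact rational arithmetic; the function names are ours.

```python
from fractions import Fraction


def mean_runs_closed_form(n, q):
    """Mean number of runs: 2q/(1+q) * n + p/(1+q), with p = 1 - q."""
    p = 1 - q
    return Fraction(2) * q / (1 + q) * n + p / (1 + q)


def mean_runs_by_linearity(n, q):
    """1 + (n - 1) * P(two adjacent letters differ), by linearity of expectation."""
    p = 1 - q
    prob_equal = p / (1 + q)  # sum over k of (p q^(k-1))^2
    return 1 + (n - 1) * (1 - prob_equal)


q = Fraction(1, 2)
print(mean_runs_closed_form(2, q), mean_runs_by_linearity(2, q))
```

Both functions expect `q` as a `Fraction` so that the comparison is exact.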

The computation of the variance is lengthier and requires that we
differentiate (2) twice. This leads after some work to

Proposition 2. The variance of the number of runs is given for $n \ge 2$ by

$$\sigma_n^2 = \mathbb{V} R_n = \frac{2q(1-q)^2(2+q^2)}{(1+q)^2(1-q^3)}\,n - \frac{2q(1-q)^2(3-q+q^2)}{(1+q)^2(1-q^3)}.$$
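The variance can be cross-checked without differentiating (2). Writing the number of runs as 1 plus a sum of indicators "letters $i$ and $i+1$ differ", only adjacent indicators are correlated, so the variance needs just $P(X_1 = X_2) = p^2/(1-q^2)$ and $P(X_1 = X_2 = X_3) = p^3/(1-q^3)$. This is a sketch of that alternative computation, not the derivation used in the paper; the function names are ours.

```python
from fractions import Fraction


def variance_runs_closed_form(n, q):
    """Closed form of Proposition 2 (valid for n >= 2)."""
    denom = (1 + q) ** 2 * (1 - q ** 3)
    lead = 2 * q * (1 - q) ** 2 * (2 + q ** 2) / denom
    const = 2 * q * (1 - q) ** 2 * (3 - q + q ** 2) / denom
    return lead * n - const


def variance_runs_by_indicators(n, q):
    """Variance of a sum of n - 1 indicators where only neighbours are dependent."""
    p = 1 - q
    e2 = p ** 2 / (1 - q ** 2)  # P(X1 = X2)
    e3 = p ** 3 / (1 - q ** 3)  # P(X1 = X2 = X3)
    var_single = e2 * (1 - e2)
    cov_adjacent = e3 - e2 ** 2
    return (n - 1) * var_single + 2 * (n - 2) * cov_adjacent


q = Fraction(1, 2)
print(variance_runs_closed_form(5, q), variance_runs_by_indicators(5, q))
```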



3 Distribution of the Number of Runs 



In this section we discuss a central limit theorem for the distribution of the
number of runs. In order to derive this, we have to extract further information
from the functional equation (2). We observe that the terms on the right-hand
side are all simple rational functions, except for the terms containing $F(qz,u)$.
By investigating the analytic properties of $F(z,u)$ it can be shown that $F(z,u)$
can be written as

$$F(z,u) = \frac{g(z,u)}{1 - f(u)z} + R(z,u), \qquad (3)$$

where $g(z,u)$ and $R(z,u)$ are holomorphic in $|z| < 1+\delta$, $|u-1| < \delta$ for some
$\delta > 0$. Now we are in the general framework of Hwang's quasi-power theorem
(cf. [5]) and can deduce the following theorem.

Theorem 1. The number of runs in words of length $n$ produced by independent
geometric random variables obeys a central limit law, more precisely

$$P\left( R_n(w) \le \frac{2q}{1+q}\,n + t\,\sqrt{\frac{2q(2+q^2)}{1-q^3}}\;\frac{1-q}{1+q}\,\sqrt{n} \right) = \Phi(t) + O(n^{-1/2}). \qquad (4)$$
4 Average Length of Runs



Given a string $w$ of geometric random variables of length $n$ with $R_n(w) = k$ runs
we define the average length of a run to be $L_n(w) = n/k$. It is of interest to
determine the moments and the distribution of this parameter over all strings of
length $n$. Intuitively, one expects that the mean length of a run should be close
to $n$ divided by the mean number of runs, which is

$$\frac{n}{\frac{2q}{1+q}\,n + \frac{p}{1+q}} = \frac{1+q}{2q}\left(1 - \frac{p}{2qn} + O\!\left(\frac{1}{n^2}\right)\right).$$

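In fact we obtain

Proposition 3. For $n \ge 1$ the mean and variance of $L_n(w)$ are given respectively by

$$\mathbb{E} L_n = \frac{1+q}{2q} + O\!\left(\frac1n\right), \qquad \mathbb{V} L_n \sim \frac{(1-q^2)^2(2+q^2)}{8q^3(1-q^3)}\,\frac1n.$$

Moreover, $L_n(w)$ obeys a central limit theorem:

$$P\left( L_n(w) \le \frac{1+q}{2q} + \frac{(1-q^2)\sqrt{2+q^2}}{\sqrt{8q^3(1-q^3)}}\,\frac{t}{\sqrt n} \right) = \Phi(t) + O(n^{-1/2}).$$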



The proof makes use of the distribution obtained for the number of runs in 
Theorem 1. 



5 Longest Runs 



In this section we study the mean of the longest run $M_n(w)$ of equal digits in
a string of length $n$. For this purpose we introduce the probability generating
function $G_h(z)$ of all strings that have runs only of length less than $h$. Similar
arguments as in the proof of (2) show that $G_h$ satisfies

$$G_h(z) = \frac{G_h(qz)}{1 - G_h(qz)\,\frac{pz(1-(pz)^{h-1})}{1-pz}} \left(\frac{1-(pz)^{h}}{1-pz}\right)^2 + \frac{pz(1-(pz)^{h-1})}{1-pz}. \qquad (5)$$

In order to extract the asymptotic behaviour of the probability that a string of
length $n$ has runs of length at most $h$, we have to find the singularities of $G_h(z)$.
Using bootstrapping we estimate $\rho_h$, the dominant singularity of the function
$G_h$.

Combining this with estimates for $G_h$ leads to



P (M„(u;) <h) = {l- pq^Y + 0{hqY- 



(6) 
Using (6) and Abel summation we then find that the first and second moment
of the longest run are given by 

EM^{w) = ^ (1 - P {M^{w) < /i) ) = ^ (1 - (1 - + 0(1), 

h>l h>l 

EM„(w) 2 = 2 ^ /i(l - P (M„(u>) <h))~ EM„(w) 

h>l 

= Y^{ 2 h-l){l-{l-pq’^r)+ 0 {l). 

h>l 

In order to compute the asymptotic behaviour of these two moments, we use 
the now classical exponential approximation technique (cf. [12]). Thereafter we 
make use of the Mellin transform and Mellin inversion formula to obtain finally 

Proposition 4. The mean value of the length of the longest run Mn{w) in a 
string of n geometric random variables satisfies 

EMn{w) = logi n + 0(1). 

g 

Similarly, we could obtain an expression for the second moment 

EM„(u;) 2 + 0(1) = login + 0(logn). (8) 
Abstract. Expressions for the generalized covariances of multi-dimensional
Brownian excursion local times are derived from corresponding
density transforms. Typical applications are moments of the cost of
structures such as the M/G/1 queue, random trees, the Markov stack or the
priority queue in Knuth's model. The Brownian excursion area and a result of
Biane and Yor are also revisited.



1 Introduction 



Throughout this paper, the standard Brownian motion (BM) will be denoted by 
x{t). 

Fix $t > 0$ and denote the last zero of $x$ before $t$ and the first zero of $x$ after
$t$ by

$G(t) := \sup\{s : s < t;\ x(s) = 0\}$

and

$D(t) := \inf\{s : s > t;\ x(s) = 0\}.$

The processes restricted to [G(t), t] and [G(t), D(t)] are called the meandering
process ending at t : Z(u) := x~^{G{t) + u),0 < u < L~{t) := t — G{t) and the 
excursion process straddling t : 

Y{u) := x^{G{t) + u),0 < u < L{t) := D{t) — G{t), respectively. The standard 

scale excursion (BE) is X{u) := [Y{u)\L = 1]; note that Y{u) = '/iXiujt) when 
L = 1. The distributions of G and L are well known: see Chung [2, Theorem 1]. 
The local time of $x(t)$ at $a$, denoted by

$$t^+(t,a) := \lim_{\varepsilon \to 0} \frac{1}{\varepsilon} \int_0^t I_{[a,a+\varepsilon]}(x(s))\,ds,$$

and the local time of the standard scaled excursion $X$ at $a$, denoted by $\tau^+(a)$,
have been studied by several authors (note that for an excursion of length $\ell$ we
have $t^+(\ell,a) = \sqrt{\ell}\,\tau^+(a/\sqrt{\ell})$). See for instance Getoor and Sharpe [9], Knight
[13], Cohen and Hooghiemstra [3], Hooghiemstra [12], Drmota and Gittenberger
[4], Louchard [14], Gittenberger and Louchard [11]. Intuitively, the local time at
$a$ is the total time spent by the excursion in the neighbourhood of $a$.
Applications of the BE are numerous: we will mention a few of them, emphasizing
the meaning of the local time. For instance, consider an M/G/1 queueing
system. There the customers arrive according to a Poisson process $(\pi_t, t \ge 0)$ with
rate $a^{-1}$ where $a > 0$. Denote the arrival time of the $n$-th customer by $t_n$
and the service time by $s_n$, which is assumed to be independent of the arrival
process $\pi_t$. Then the actual waiting time process is defined by $w_1 := 0$, $w_{n+1} :=
\max\{0, w_n + s_n - (t_{n+1} - t_n)\}$ and the virtual waiting time process by

$v_t := \max\{0, w_{\pi_t} + s_{\pi_t} - (t - t_{\pi_t})\}, \quad t \ge 0.$
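The recursion for the actual waiting time is easy to evaluate on a concrete arrival pattern. The sketch below uses made-up arrival and service times purely for illustration; the function name and the sample data are ours.

```python
def actual_waiting_times(arrival_times, service_times):
    """Apply w_1 = 0, w_{n+1} = max(0, w_n + s_n - (t_{n+1} - t_n))."""
    waits = [0.0]
    for n in range(len(arrival_times) - 1):
        gap = arrival_times[n + 1] - arrival_times[n]
        waits.append(max(0.0, waits[n] + service_times[n] - gap))
    return waits


# Hypothetical data: four customers with their arrival and service times.
arrivals = [0.0, 1.0, 1.5, 4.0]
services = [2.0, 1.0, 0.5, 1.0]
print(actual_waiting_times(arrivals, services))
```

A waiting time of zero (here for the last customer) marks the start of a new busy period.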



Furthermore, denote the length of the first busy period by £. Then Cohen and 
Hooghiemstra [3] have shown that for arbitrary 5 > 0 the following limit theorem 
holds: 




<£<s-|-(5),0<t6<l X{u), 



s — >■ oo. 



In this context the BE local time process appears as the weak limit of the (suit- 
ably normalized) number of downcrossings of the virtual waiting time process, 
i.e. 



d{v) = '■ 0 < t < £,Vt = u}; 



(#A denotes the cardinality of A) conditioned on the number of customers
served during the first busy period (see [3, Sec. 7]). Another BE application is
the number of nodes at some level in a random tree. Consider a simply generated
random tree (according to the notion of Meir and Moon [19]) or, equivalently,
the family tree of a Galton-Watson branching process conditioned on the total
progeny. Then BE appears as the weak limit of the contour process of this tree,
i.e. the process constructed of the distances of the nodes from the root when
traversing the tree (for details see Gittenberger [10]). The local time corresponds
here to the number of nodes at some level. The generation sizes of the branching
processes converge weakly to BE local time. The external path length (EPL) of
a random tree is given by the sum of distances from the root to the leaves.

Dynamical algorithms are also related to the BE. The stack structure of length
2n (see Flajolet [6] p. 126) is asymptotically equivalent to a BE (Louchard [15]).
The priority queue in Knuth's model is combinatorially equivalent to a Markov
stack (see Louchard et al. [17]). So the distribution of the size of this structure
is asymptotically related, after suitable normalization, to the BE local time. The
local time corresponds to the time spent by the structure at some level.

The cost $G$ of structures such as the M/G/1 queue busy period, random tree,
Markov stack or priority queue in Knuth's model is asymptotically given, for any
cost function $g(\cdot)$, by $G = \int_0^1 g[X(u)]\,du$. For stacks and priority queues, the cost
is related to the size. For the M/G/1 queue, the cost is related to the waiting
time. For EPL, the cost is related to the distance to the leaves.

Moments of $G$ are immediately related to the local time: we have

$$E[G^d] = d! \int_0^\infty dx_1 \int_{x_1}^\infty dx_2 \cdots \int_{x_{d-1}}^\infty dx_d\; g(x_1)g(x_2)\cdots g(x_d)\,K(x_1,x_2,\dots,x_d)$$

with $K(x_1,x_2,\dots,x_d) := E[\tau^+(x_1)\tau^+(x_2)\cdots\tau^+(x_d)]$ denoting the generalized
covariances. In this paper we obtain explicit expressions for $K(x_1,x_2,\dots,x_d)$. We
also revisit two classical examples: the BE area ($g(x) = x$) related to the Airy
distribution (which has a lot of applications in combinatorics and data
structures) and a result of Biane and Yor [1] related to $g(x) = 1/x$.

The paper is organized as follows. Sec. 2 gives the basic formulas we need
in the sequel, Sec. 3 provides an efficient algorithm for the computation of the
generalized covariances. In Sec. 4, we consider two typical applications: the Brownian
excursion area and the Biane and Yor formula.



2 Basic Formulas

In this section, we start from known results to derive expressions for the first
generalized covariances $K(x_1,x_2,\dots,x_d)$, $d = 1,\dots,4$.

In [11] we obtained the following result depending on some Laplace
transforms.



E[e ^ ^ f e^e{d)da,Xd > Xd-i ■ ■ > xi 

\/2ni Js 

where S := [a — ioo, a + zoo], a > 0, rrix ■= inf{5 : X(s) = x}, 

^ 2 [F,{d)npd + c,{d) + C2{d)D2{d)/F,{d)] 

with some functions depending only on a and x.: 



Ci{d) = \j'^E{d, d - l)/Sh{d, d-l) 

^ ~2Sh'^{d,d-l) 

r (d) = Sh{d,d-2) 

^ V 2 Sh{d, d - l)Sh{d - 1, d - 2) 

Ci{d) = C2{d - 1) 

C^id) = V2Sh{d,d-l) 

and some functions depending also on /?.: 



Fi{d) = Pd-Md) + Di{d) 

Z?2(d) = Pd-2D4{d) + D^{d) 

= C^{d) + C4{d)D4{d)/D2{d) 

D^{d) = C^{d)D4{d-l) 

D4{d) = C^{d)D2{d-l) 
E{e,m) :=

Sh{i, m) := sinh[-y^(x^ — Xm)], Sh{i) := sinh(-\Aa;^) 

•\A := V^- 



Initialisations are given by 

L»i(2) = ^Sh{2)/V2 
D2{2) = Sh{l)Sh{2,l) 



From (1), it is possible (with MAPLE) to derive explicit expressions for
successive derivatives of $\theta(d)$. For instance, with E(i) :=

= 2E(1)-2 

Pl = 0 



K{a,xi) = 



80 



8f3i 



K{a, xi,X2) = 



8^0 

882881 



= ^E{2)-\E{l)-^-l) 

01,132=0 V« 



(2) 



K{a,Xi,X2,X3) = 

J3E(1)-^E(2)-^ 



8^0 

883882881 



/3i,/32,/33=0 
-2^ 



-E(2)-2 -2E(1)-2)(E(1) 



^-2 



E{1) 



-2^ 



1)E(3)-2 



(3) 



K{a, Xi,X 2 ,X 3 ,X 4 ) = 



80 



8 8i8 83d 828 81 



/3i./32,/33,/34=0 



-—E{E)-\E{1)-^-1)- 

■{<6E{l)-^E{2)-^E{‘i)~^ - 6E(1)-2f;(3)-2£:(2)-2 + 2E{1)~'^ E{2)~'^ + E{2)~‘^ 
-hE(l)-2^;(3)-2 _ 3 ^;(i)-2£;(2)-4 + 2E{‘i)~'^ E{2)~^ - 3E(2)-^E(3)"2)/ 

(E(2)-2E(l)-2a3/2) 



High order derivatives become difficult to compute, even with MAPLE. So an- 
other technique is obviously needed. 



3 An Efficient Algorithm for Generalized Covariances 
Computation 

In this section, we first derive a recurrence equation for some functions arising 
in the generalized covariances. This leads to some differential equations for re- 
lated exponential generating functions. A simple matrix representation is finally 
obtained and it remains to invert the Laplace transforms. 
3.1 A Recurrence Equation

Let us first differentiate $\theta$ w.r.t. $\beta_d$ (each time, after differentiation w.r.t. $\beta_i$, we
set $\beta_i = 0$). This gives

—a‘^ 

2[CiFi + C2D2V 

We should write $C_1(d)$, etc., but we drop the dependency on $d$ to ease notation.
Differentiating now w.r.t. Pd-i, this leads to 



a<^CiD2 a<^Ci[Pd-2D4 + D3] 

[C 4 D 4 + [Cyf3d-2Di + + CiCiDi]^ 

with C'/{d) := Ci{d)C 3 {d) + C 2 {d) = C'i(d)C'i(d— 1) after detailed computation. 
It is clear that the next differentiations will lead to some pattern. Indeed, set 



H{d, i) := 



d<i 



-2 



D2{dy 



d(3d-2 ■ -dfdi [Ci{d)Dyd) + C2{d)D2{d)] 



2-|-2 I 



(4) 



l/5d-2"/3i=o 



obviously 



d'^0 



df3d ■ -dPi 



= C'i(d)a^(-l)‘^iL(d,l) 



(5) 



0d-0l=O 



Expanding (4), we derive (omitting the details) 



Ciidy+^Ciid-iy-wydy' 

• E (* ■ M {- iy -^-^ C 2{ d - iy -^- y -2 H{d -i,z- j ) 
i=o ^ 

+ (i + 2)C2 (c? — l)iL(d — 1, i — j + 1)] (6) 

(6) is still too complicated. So we set first Hi{d, i) := H{d, i)C 2 {d,y~^. This leads 
to 



H (d 3= 

^ Ci{dy+^Ci{d - ly-w^idy 

• E f * ■ M {-iy-^-y-2Hyd -l,i-j) + {^ + 2)Hyd -1,1-3 + 1)] 
j=o ^ 2 

But we remark that c+d)cf(d-i) = 

C^{d)-.= E{dy -E{d-lf (7) 

Then, we set H 2 {d,i) := Hi{d,i)Ce{d,y~^ and we obtain 

^ cydyc^idy' 
• ^ r . M {-iyc^{d-iy[-2H2{d-i,i-j)
i=o ^ ^ 



3.2 Some Generating Function 

Eq. (8) is a perfect candidate for an exponential generating function (see Flajolet
and Sedgewick [5]). We set



OO 

^ 2 {d,v) := ^ 

1 



H 2 {d,i)v" ^ 



(8) leads to 

(P 2 {d,v) = 



1 

cydycydy 



- 2 .Mi -!,«)+ .] 



,-«Ce(d-l) 



1 






C^id-l) 

With ( 7 ), we are led to set 

if3{d,v) := ip2{d,v)e"^^'^~^'^ and H{d,l) = (p3{d,0) 

Before establishing the corresponding equation for (p3, it is now time to find the 
effect of all our transforms on (p2- Indeed, H{ 2 ,i) = 7^3”^ (see ( 4 ) with 

(5i = D2{2),S2 = Cy2)Di{2) + C2{2)D2{2) 

_ 61 _ -2E(2)-2(E(1)-2 - E(2)-2)(E(1)-2 - 1) 

(if “ E(l)-2a3 



(is = 



So 



Hi{2,i) = 7 ( 54 ~\ with ii4 = 6302(2) 

772(2,1) = 7 <i 5 “\ with 65 = 64Ce(2) 

tp2(2,v) = je'"^y(p3(2,v) = 7 e"'^®, with , 5 g = (ig + E(l)^ = 1 , 

after all computations. 

We see that it is convenient to finally set 

(f3(d, v) := ip4(d, u)e“, ip4(2, v) = 7, H(d, 1 ) = ip4(d, 0 ) 
The differential equation for $\varphi_4$ is computed as follows (we omit the details)

_ 2{E{d-l)-^-E{d)-^)E{d)-^ 

’ E{d-l)-^{E{d- 2 )-^ - E(d-l)-^)^/^^ 

92 d 

HlV-^(fi4{d - 1, f ) + (/X 2 + fi 3 v)—(p 4 {d - 1, v) 



+ (pt4 + /J,5v)lfi4(d - 1, u)] 



(9) 



with 



$\mu_1 := E(d-2)^{-2}E(d-1)^{-2}$

$\mu_2 := 3E(d-2)^{-2}E(d-1)^{-2}$

$\mu_3 := 2E(d-2)^{-2}E(d-1)^{-2} - E(d-2)^{-2} - E(d-1)^{-2}$

$\mu_4 := -2E(d-2)^{-2} - E(d-1)^{-2} + 3E(d-2)^{-2}E(d-1)^{-2}$

$\mu_5 := E(d-2)^{-2}E(d-1)^{-2} - E(d-1)^{-2} - E(d-2)^{-2} + 1.$

3.3 A Matrix Representation 

It is now clear that $\varphi_4(d,v)$ is made of 2 parts: the first one is given by the
product of $\gamma$ with all coefficients in front of (9).

The other part is given, for each $d$, by a polynomial in $v$, the coefficients of
which are given by the following algorithm.

Start with $vec_2[0] = 1$, $vec_2[i] = 0$, $i \ge 1$.

Construct a tri-diagonal band matrix $A_d$ as follows: if we apply the differential
operator of (9), i.e. $[\mu_1 v \frac{\partial^2}{\partial v^2} + (\mu_2 + \mu_3 v)\frac{\partial}{\partial v} + (\mu_4 + \mu_5 v)]$, to a polynomial $\sum_{i \ge 0} a_i v^i$,
we see that the new polynomial $\sum_{i \ge 0} \tilde a_i v^i$ is given by

$\tilde a_0 = A_d[0,1]\,a_1 + A_d[0,0]\,a_0$

$\tilde a_i = A_d[i,i+1]\,a_{i+1} + A_d[i,i]\,a_i + A_d[i,i-1]\,a_{i-1}, \quad i \ge 1$

with

$A_d[i,i+1] := [i(i+1) + 3(i+1)]\,E(d-1)^{-2}E(d-2)^{-2}$

$A_d[i,i] := [(2i+3)E(d-1)^{-2}E(d-2)^{-2} - (i+2)E(d-2)^{-2} - (i+1)E(d-1)^{-2}]$

$A_d[i,i-1] := [E(d-1)^{-2}E(d-2)^{-2} - E(d-1)^{-2} - E(d-2)^{-2} + 1] \qquad (10)$

All other elements of $A_d$ are set to 0.
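The correspondence between the band matrix and the differential operator can be verified on arbitrary coefficient vectors. In the sketch below, `e1` and `e2` are placeholders for the quantities $E(d-1)^{-2}$ and $E(d-2)^{-2}$ (their actual values depend on the Laplace variable and on the $x_i$, which we do not model here), and all function names are ours.

```python
from fractions import Fraction


def band_row_entries(i, e1, e2):
    """Return (A[i,i+1], A[i,i], A[i,i-1]) with e1 = E(d-1)^-2, e2 = E(d-2)^-2."""
    upper = (i * (i + 1) + 3 * (i + 1)) * e1 * e2
    diag = (2 * i + 3) * e1 * e2 - (i + 2) * e2 - (i + 1) * e1
    lower = e1 * e2 - e1 - e2 + 1
    return upper, diag, lower


def apply_band_matrix(a, e1, e2):
    """Apply the tri-diagonal matrix to coefficients a; the degree grows by one."""
    n = len(a)

    def coeff(k):
        return a[k] if 0 <= k < n else 0

    out = []
    for i in range(n + 1):
        upper, diag, lower = band_row_entries(i, e1, e2)
        out.append(upper * coeff(i + 1) + diag * coeff(i) + lower * coeff(i - 1))
    return out


def apply_operator_directly(a, e1, e2):
    """Apply mu1*v*d2/dv2 + (mu2 + mu3*v)*d/dv + (mu4 + mu5*v) term by term."""
    mu1 = e1 * e2
    mu2 = 3 * e1 * e2
    mu3 = 2 * e1 * e2 - e2 - e1
    mu4 = -2 * e2 - e1 + 3 * e1 * e2
    mu5 = e1 * e2 - e1 - e2 + 1
    out = [0] * (len(a) + 1)
    for i, c in enumerate(a):
        if i >= 1:
            out[i - 1] += (mu1 * i * (i - 1) + mu2 * i) * c
        out[i] += (mu3 * i + mu4) * c
        out[i + 1] += mu5 * c
    return out


print(apply_band_matrix([Fraction(1)], Fraction(1, 3), Fraction(2, 5)))
```

Starting from the vector `[1]` and applying the matrix repeatedly (with the appropriate `e1`, `e2` at each step) mirrors the iteration described in the text.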

Successive applications of $A_i$ to $vec_2$ give the coefficients of the polynomial
part of $\varphi_4(d,v)$:

$$vec_d := \prod_{i=3}^{d} A_i\; vec_2 \qquad (11)$$
Now, by (5), = Ci{d)a'^{-lYipi{d,Ci) = Cs{d) vecd[0], where

Cs{d), after all simplification, is given, with (9) by 



Cs{d) 



-4{-1Y E{d)-YE{l)-^-l) 

^d-i E(X)-’^--E{d-2)-^' - 



(12) 



Let us summarize our results in the following theorem 

Theorem 1. K{a,xi,X 2 ■ -Xd) = I lgd-/3i=o = C's(c^) vecd[Q] where Cs{d) 

is given by (12), $vec_d = \prod_{i=3}^{d} A_i\, vec_2$, with $vec_2[0] = 1$, $vec_2[i] = 0$, $i \ge 1$, and
the band matrix $A_i$ is given by (10).

The computation of our covariances is now trivial. 



3.4 Inverting the Laplace Transforms 

It is well known that £q,[/(m)] = with f{u) = ^a[g{u)] = 

with g(u) = ^ J— . Hence, from (2), 

E[t+{xi)t+{x2)] = - e~^^-],X2 > xi (13) 

We recover immediately Cohen and Hooghiemstra [3], (6.15) 

Similarly, from (3), 

L;[T+(a;i)T+(a;2)T+(a;3)] 

= aJ {3[e-2["i+"=+"3]Vt(^^ + ^2 + xs) - e-"["^+"^l'/*(:E2 + a^s)] 

_jg-2p,+.3lVt(^2 + Xs) - e-2["3+^2-xi]"(3.2 + X3 - Xi)] 

_^[e-2[.,+.,f/t^x, + X3) - e-2[-3lV*(a,3)] J ^ (14) 

Next covariances lead to similar expressions, with multiple integrals on t. 



4 Some Applications 

In this section, we apply the generalized covariances to two classical problems: 
the Brownian Excursion Area and a result of Biane and Yor related to the cost 
function g(u) = 1/x. We can proceed either from the Laplace transforms or from 
the explicit covariances. 

4.1 Brownian Excursion Area 

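In [14], [16] we proved that $W_d := E[\int_0^1 X(u)\,du]^d$ satisfies the following
recurrence equation. Let $\gamma_k := (36\sqrt2)^k\,W_k\,\Gamma(\frac{3k-1}{2})/(2\sqrt\pi)$.
Then

$$\gamma_k = \frac{12k}{6k-1}\,\varphi_k - \sum_{j=1}^{k-1} \binom{k}{j} \varphi_j \gamma_{k-j}, \quad k \ge 1, \qquad (15)$$

with $\varphi_k := \Gamma(3k + \frac12)/\Gamma(k + \frac12)$.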

The corresponding distribution, also called the Airy distribution, has been
the object of recent renewed interest (see Spencer [20], Flajolet et al. [8], where
many examples are given).

(15) leads to $W_1 = \sqrt{2\pi}/4$, $W_2 = 5/12$, $W_3 = \sqrt{2\pi}\,15/128$, ... From (2) we
compute the Laplace transforms

Inverting, this leads to 5/12 as expected. 

Similarly with G(a) given by (3), 3! xidxi X 2 ctx 2 xsdxsGla) = 
i 2 Sa^ ■ Inverting, this leads to $\sqrt{2\pi}\,15/128$ as expected.

An interesting question is how to derive the recurrence (15) from the matrix 
representation given by Theorem 1. 
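The first moments of the excursion area quoted above can be reproduced numerically. The sketch below assumes that the recurrence (15) has the form $\gamma_k = \frac{12k}{6k-1}\varphi_k - \sum_{j=1}^{k-1}\binom{k}{j}\varphi_j\gamma_{k-j}$ with $\varphi_k = \Gamma(3k+\frac12)/\Gamma(k+\frac12)$ and $\gamma_k = (36\sqrt2)^k W_k \Gamma(\frac{3k-1}{2})/(2\sqrt\pi)$; this form is an assumption on our part, chosen because it is consistent with the values $W_1 = \sqrt{2\pi}/4$, $W_2 = 5/12$ and $W_3 = \sqrt{2\pi}\,15/128$ stated in the text. The function name is ours.

```python
import math


def excursion_area_moments(kmax):
    """Return [W_1, ..., W_kmax] from the assumed gamma_k recurrence."""

    def phi(k):
        return math.gamma(3 * k + 0.5) / math.gamma(k + 0.5)

    gam = {}
    for k in range(1, kmax + 1):
        gam[k] = 12 * k / (6 * k - 1) * phi(k) - sum(
            math.comb(k, j) * phi(j) * gam[k - j] for j in range(1, k)
        )
    # Undo the normalization gamma_k = (36*sqrt(2))^k W_k Gamma((3k-1)/2) / (2 sqrt(pi)).
    return [
        gam[k] * 2 * math.sqrt(math.pi)
        / ((36 * math.sqrt(2)) ** k * math.gamma((3 * k - 1) / 2))
        for k in range(1, kmax + 1)
    ]


print(excursion_area_moments(3))
```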



4.2 A Formula of Biane and Yor 

In [1], Biane and Yor proved that 

Jo [ 0 , 1 ] 

With our techniques, we prove in the full report [18] that all moments of both 
sides are equal. 



5 Conclusion 

We have constructed a simple and efficient algorithm to compute the generalized
covariances $K(x_1,x_2,\dots,x_d)$. Another challenge would be to derive the
cross-moments of any order:

$K(x_1,i_1,x_2,i_2,\dots,x_d,i_d) = E[\tau^+(x_1)^{i_1}\tau^+(x_2)^{i_2}\cdots\tau^+(x_d)^{i_d}]$

It appears that the first two derivatives 

d^d;^;_\0y.=p...=o 

lead to a linear combination of terms of type 



D2{dYD^{dY/{Gi{d)Di{d) + C2{d)D2{d)r 
The next derivative, after all simplifications (and setting $\beta_{d-2} = 0$), leads
to terms of type

H{d-l,s,t) := 

and this pattern appears in all successive derivatives. 

So, we can, in principle, construct (complicated) recurrence equations for
$H(\cdot,s,t)$ and recover our $K$ by linear combinations. This is quite tedious and
up to now, we could not obtain such a simple generating function as in Sec. 3.2.
Abstract. For $n$ independently distributed geometric random variables
we consider the average length of the $m$-th run, for fixed $m$ and $n \to \infty$.
One particular result is that this parameter approaches $1 + q$.

In the limiting case $q \to 1$ we thus rederive known results about runs in
permutations.



1 Introduction 

Knuth in [6] has considered the average length $L_k$ of the $k$th ascending run in
random permutations of $n$ elements (for simplicity, mostly the instance $n \to \infty$
was discussed).

This parameter has an important impact on the behaviour of several sorting
algorithms.

Let $X$ denote a geometrically distributed random variable, i.e. $P\{X = k\} =
pq^{k-1}$ for $k \in \mathbb{N}$ and $q = 1 - p$.

In a series of papers we have dealt with the combinatorics of geometric
random variables, and it turned out that in the limiting case $q \to 1$ the results
(when they made sense) were the same as in the instance of permutations.
Therefore we study the concept of ascending runs in this setting. We are
considering infinite words, with letters $1, 2, \dots$, and they appear with probabilities
$p, pq, pq^2, \dots$. If we decompose a word into ascending runs

$$a_1 < \cdots < a_r \ge b_1 < \cdots < b_s \ge c_1 < \cdots < c_t \ge \cdots,$$

then $r$ is the length of the first, $s$ of the second, $t$ of the third run, and so on.
We are interested in the averages of these parameters.

* This research was partially conducted while the author was a guest of the projet Algo 
at INRIA, Rocquencourt. The funding came from the Austrian-French “Amadee” 
cooperation. 



G. Gonnet, D. Panario, and A. Viola (Eds.): LATIN 2000, LNCS 1776, pp. 473-482, 2000.
© Springer-Verlag Berlin Heidelberg 2000
2 Words with Exactly m Runs



As a preparation, we consider the probability that a random word of length $n$
has $m$ ascending runs. For $m = 1$, this is given by

$$[z^n] \prod_{i \ge 1} \bigl(1 + pq^{i-1}z\bigr) \quad \text{for } n \ge 1.$$

But the product involved here is well known in the theory of partitions; the
usual notation is

$$(a)_n = (a;q)_n = (1-a)(1-aq)(1-aq^2)\cdots(1-aq^{n-1})$$

and

$$(a)_\infty = (a;q)_\infty = (1-a)(1-aq)(1-aq^2)\cdots.$$

Therefore

$$\prod_{i\ge1}\bigl(1 + pq^{i-1}z\bigr) = (-pz)_\infty = \sum_{n\ge0} \frac{p^n q^{\binom n2} z^n}{(q)_n},$$

the last equality being the celebrated identity of Euler [2]. This was already
noted in [7]. If we set

$$A_m(z) = \sum_{n\ge0} [\text{Pr. that a word of length } n \text{ has (exactly) } m \text{ ascending runs}]\, z^n$$

then $A_0(z) = 1$ and $A_1(z) = (-pz)_\infty - 1$.
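Euler's identity gives a closed form for the probability that a word of length $n$ is a single ascending run. The sketch below compares it numerically with a direct expansion of the product, truncated to a finite alphabet; the truncation level `letters` and the function names are our own choices.

```python
def single_run_probability_product(n, q, letters=200):
    """[z^n] of prod_{i=1..letters} (1 + p q^(i-1) z), by polynomial multiplication."""
    p = 1.0 - q
    coeffs = [1.0] + [0.0] * n
    for i in range(1, letters + 1):
        c = p * q ** (i - 1)
        # Update from high degree to low so each factor is used once.
        for d in range(n, 0, -1):
            coeffs[d] += c * coeffs[d - 1]
    return coeffs[n]


def single_run_probability_euler(n, q):
    """p^n q^(n(n-1)/2) / ((1-q)(1-q^2)...(1-q^n))."""
    p = 1.0 - q
    denom = 1.0
    for k in range(1, n + 1):
        denom *= 1.0 - q ** k
    return p ** n * q ** (n * (n - 1) // 2) / denom


for n in range(5):
    print(n, single_run_probability_product(n, 0.5), single_run_probability_euler(n, 0.5))
```

For $q = 1/2$ and $n = 2$ both expressions give $1/3$, the probability that two independent letters are strictly increasing.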

Now for general $m$ we should consider $(-pz)_\infty^m$. Indeed, words with exactly
$m$ ascending runs have a unique representation in this product. However, this
product also contains words with fewer than $m$ runs, and we have to subtract
those.

A word with $m - 1$ ascending runs is contained $n + 1$ times as often as in
$A_{m-1}(z)$. This is so because we can choose any gap between two letters (also on
the border) in $n + 1$ ways. Such a gap means that we deliberately cut a run into
pieces. Then, however, everything is unique. In terms of generating functions,
this is $D(zA_{m-1}(z))$. (We write $D = \frac{d}{dz}$.) For $m - 2$ ascending runs, we can
select 2 gaps in $\binom{n+2}{2}$ ways, which amounts to $\frac12 D^2(z^2A_{m-2}(z))$, and so on.

Therefore we have the following recurrence:

$$A_m(z) = P^m - \sum_{k=1}^{m} \frac{1}{k!}\,D^k\bigl(z^kA_{m-k}(z)\bigr), \qquad A_0(z) = 1;$$

here are the first few values. We use the abbreviations $P = (-pz)_\infty$ and $P_k =
z^kD^kP$.



$A_0 = 1,$

$A_1 = P - 1,$

$A_2 = (P-1)P - P_1,$

$A_3 = (P-1)P^2 + P_1 - 2PP_1 + \frac12 P_2,$

$A_4 = (P-1)P^3 + 2PP_1 - \frac12 P_2 - 3P^2P_1 + P_1^2 + PP_2 - \frac16 P_3.$



In the limiting case $q \to 1$ we can specify these quantities explicitly. This
was obtained by experiments after a few keystrokes with trusty Maple. Instead
of $P$ we just have $e^z$, and that definitely makes life much easier, since all the
derivatives are still $P$.

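Theorem 1. The sequence $A_m(z)$ is defined as follows,

$$A_m(z) := e^{mz} - \sum_{k=1}^{m} \frac{1}{k!}\,D^k\bigl(z^kA_{m-k}(z)\bigr), \qquad A_0(z) := 1.$$

Then we have for $m \ge 1$

$$A_m(z) = \sum_{j=0}^{m} e^{jz}\,z^{m-j-1}\,\frac{j^{m-j-1}(-1)^{m-j}}{(m-j)!}\,\bigl(jz + m - j\bigr).$$

Proof. First notice that if we write

$$\bar A_m(z) = \sum_{j=0}^{m} e^{jz}\,\frac{(-jz)^{m-j}}{(m-j)!}$$

for $m \ge 0$, then $A_m(z) = \bar A_m(z) - \bar A_{m-1}(z)$. And the equivalent formula is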

771/ 771/ 

k^O j=l 

which we will prove by induction, the basis being trivial (as usual). Now 



— k 






(_j\m-k-j 
,jz m-j V J ) 






i/—k 



Z 






= IlsS i 

k—0 z=0 ' j — 1 



(m — k — j)l 



(m-j)! (-j)™ ^ 



(to — j — if. (to — k — j)\ ’ 
and we have to prove that the coefficient of therein is 1. Writing $M := m - j$
it is 



M k 

binomkij 

k—0 i—0 



k-i M-i (~i) 

Z 



M-k 



= E 



M M 

^ ... 



{M-i)\ {M-k)\ 



^ 1 (k\ (-1)^-^ 
^ fc! {M-k)\ 



M 



= E 



2=0 

M 



2=0 



M 



(i^) 



M-2 



(M-z)! ^ V 

k—i 



^ETEk-d 



M-fc 



f-2 (~1) 



M-i 



= E ( 'E = 1. 



{M-i)\ 



Thus 



E = Y: ^,D\z^\^.,{z)) - Y ^,D\z^A^.,.,{z)) 



fc=l 



fc = l 



A:! 



fc=i 



/c! 



m— 1 



= ^ - A™(z) - ^ + A^_i(z) = - 7l„(z), 

j=i i=i 



and the result follows. □ 

Remark. Since every word has some number of ascending runs, we must have
that

$$\sum_{m\ge0} A_m(z) = \frac{1}{1-z}.$$

In the limiting case $q \to 1$ we will give an independent proof. For the general
case, see the next sections. Consider

$$A_0(z) + A_1(z) + \cdots + A_m(z) = \bar A_m(z);$$

for $m = 6$ we get e.g.

$$\bar A_6(z) = 1 + z + z^2 + z^3 + z^4 + z^5 + z^6 + \frac{5039}{5040}z^7 + \frac{5009}{5040}z^8 + \frac{38641}{40320}z^9 + O(z^{10}).$$

Now this is no coincidence since we will prove that

$$[z^n]\bar A_m(z) = 1 \quad \text{for } n \le m.$$
This amounts to proving that^1

$$\frac{1}{n!}\sum_{k=0}^{n} (-1)^k \binom{n}{k}(m-k)^n = 1 \quad \text{for } n \le m.$$

Notice that

$$\frac{1}{n!}\sum_{k=0}^{n} (-1)^{n-k}\binom{n}{k}\,k^h = \left\{ {h \atop n} \right\}$$

are Stirling subset numbers, and they are zero for $h < n$. Therefore, upon
expanding $(m-k)^n$ by the binomial theorem, almost all the terms are annihilated,
only $(-k)^n$ survives. But then it is a Stirling number $\left\{ {n \atop n} \right\} = 1$. It is not too
hard to turn this proof involving “discrete convergence” into one with “ordinary
convergence.” □
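The series expansion quoted in the remark can be reproduced with exact arithmetic. The sketch below assumes that, in the limiting case $q \to 1$, the partial sum $A_0(z) + \cdots + A_m(z)$ equals $\sum_{j=0}^{m} e^{jz}(-jz)^{m-j}/(m-j)!$, which is our reading of the garbled formula in the proof; the function name is ours.

```python
from fractions import Fraction
from math import factorial


def partial_sum_coefficient(m, n):
    """[z^n] of sum_{j=0..m} e^{jz} (-jz)^(m-j) / (m-j)!  (assumed q -> 1 form)."""
    total = Fraction(0)
    for j in range(max(0, m - n), m + 1):
        k = m - j  # power of z contributed by the factor (-jz)^k
        total += Fraction((-1) ** k * j ** n, factorial(k) * factorial(n - k))
    return total


print([partial_sum_coefficient(6, n) for n in range(10)])
```

The coefficient of $z^n$ is the probability that a random permutation of $n$ elements has at most $m$ ascending runs, which explains why it equals 1 as long as $n \le m$.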



3 The Average Length of the mth Run 



We consider the parameter “combined lengths” of the first $m$ runs in infinite
strings and its probability generating function.

Note carefully that now the elements of the probability space are infinite
words, as opposed to the previous section where we had words of length $n$. This
is not unlike the situation in the paper [4].

To say that this parameter is larger than or equal to $n$ is the same as to say that
a word of length $n$ (the first $n$ letters of the infinite word) has $\le m$ ascending
runs. Therefore the sought probability generating function is

$$F_m(z) = \frac{z-1}{z}\bigl(A_0(z) + \cdots + A_m(z)\bigr) + \frac1z.$$

Now

$$F_m'(1) = A_0(1) + \cdots + A_m(1) - 1 = A_1(1) + \cdots + A_m(1)$$

is the expected value of the combined lengths of the first $m$ runs. Thus $A_m(1)$
is the expected value of the length of the $m$th run ($m \ge 1$).

In the limiting case $q \to 1$ we can say more since we know $A_m(z)$ explicitly:

$$L_m = A_m(1) = m\sum_{j=0}^{m} \frac{(-1)^{m-j}\,j^{m-j-1}}{(m-j)!}\,e^{j},$$

and this is exactly the formula that appears in [6] for the instance of
permutations.

There, we also learn that $L_m \to 2$; in the general case, we will see in the next
sections that $L_m \to 1 + q$.
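The formula for permutations is simple to evaluate. The sketch below assumes the form $L_m = m\sum_{j=0}^{m}(-1)^{m-j} j^{m-j-1} e^{j}/(m-j)!$ (with $0^0 = 1$), which reproduces the classical values $L_1 = e - 1$ and $L_2 = e^2 - 2e$; the function name is ours.

```python
import math


def run_length_permutations(m):
    """Expected length of the m-th ascending run (m >= 1) in a long random permutation."""
    total = 0.0
    for j in range(m + 1):
        total += ((-1) ** (m - j) * float(j) ** (m - j - 1)
                  * math.exp(j) / math.factorial(m - j))
    return m * total


for m in range(1, 7):
    print(m, run_length_permutations(m))
```

The printed values approach 2 quickly, in line with the limit stated in the text.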

^1 A gifted former student of electrical engineering (Hermann A.) contacted me years
after he was in my Discrete Mathematics course, telling me that he found this (or an
equivalent) formula, but could not prove it. He was quite excited, but after I emailed
him how to prove it, I never heard from him again.
4 Solving the Recursion

The effect of the operator $\frac{1}{k!}D^kz^k$ on a power series $f$ can also be described by the
Hadamard product (see [3] for definitions) of $f$ with the series

$$T_k := \sum_{n\ge0} \binom{n+k}{n} z^n.$$

The reformulation of the recursion is then

$$\sum_{k=0}^{m} T_k \odot A_{m-k} = P^m.$$

We want to invert this relation to get the $A_m$'s from the powers of $P$. We get

$$A_m(z) = \sum_{k=0}^{m} U_k \odot P^{m-k},$$

where

$$U_k = [w^k]\,\frac{1}{\sum_{j\ge0} T_j w^j}.$$

Lemma 1.

$$U_k = (-1)^k \sum_{n\ge0} \binom{n+1}{k} z^n.$$

Proof. Since

$$U_n = -\sum_{k=0}^{n-1} U_k \odot T_{n-k},$$

it is best to prove the formula by induction, the instance $n = 0$, i.e. $U_0 = 1/(1-z)$,
being obvious.

The righthandside multiplies the coefficient of z” by 



n— 1 






k=0 



n+l\ fn + k — I 

I 



= (-ir^E 

1=0 
'n+1 



n + 1\ f—n — 1 



I 



k-l 



= {-iy 



which finishes the proof. 

Therefore we get the formula

Proposition 1.

$$[z^n]A_m(z) = [z^n]\sum_{k=0}^{m} (-1)^k \binom{n+1}{k} P^{m-k};$$

for $n < m$ this is zero by the combinatorial interpretation or directly.
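In the limiting case $q \to 1$, where $P = e^z$, the formula of Proposition 1 can be compared with a brute-force count over permutations. The sketch assumes the reading $[z^n]A_m(z) = [z^n]\sum_{k=0}^{m}(-1)^k\binom{n+1}{k}P^{m-k}$, so that $n!\,[z^n]A_m(z) = \sum_k (-1)^k\binom{n+1}{k}(m-k)^n$; the function names are ours.

```python
from itertools import permutations
from math import comb


def count_ascending_runs(seq):
    """Number of maximal strictly ascending runs in a non-empty sequence."""
    return 1 + sum(1 for a, b in zip(seq, seq[1:]) if a > b)


def perms_with_m_runs_bruteforce(n, m):
    return sum(1 for perm in permutations(range(n))
               if count_ascending_runs(perm) == m)


def perms_with_m_runs_formula(n, m):
    """n! [z^n] sum_k (-1)^k C(n+1, k) e^{(m-k) z}."""
    return sum((-1) ** k * comb(n + 1, k) * (m - k) ** n for k in range(m + 1))


for m in range(1, 5):
    print(m, perms_with_m_runs_bruteforce(4, m), perms_with_m_runs_formula(4, m))
```

For $n = 4$ both columns give the Eulerian numbers 1, 11, 11, 1.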
Now this form definitely looks like an outcome of the inclusion-exclusion
principle. Indeed, there are n + 1 gaps between the n letters, and one can pick 
k of them where the condition that a new run should start is violated. Since we 
don’t need that, we confine ourselves to this brief remark. 

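Let us prove that

\[ \sum_{m \ge 0} [z^n] A_m(z) = 1. \]

Indeed, since the terms with $m > n$ vanish,

\[ [z^n] \sum_{m \ge 0} A_m(z)
 = [z^n] \sum_{0 \le k \le m \le n} (-1)^k \binom{n+1}{k} P^{m-k}
 = [z^n] \sum_{i=0}^{n} P^{i} \sum_{k=0}^{n-i} (-1)^k \binom{n+1}{k}
 = [z^n] \sum_{i=0}^{n} (-1)^{n-i} \binom{n}{i} P^{i}
 = [z^n] (P-1)^n = 1. \]

(Note that $P - 1 = z + \cdots$.)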
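
Proposition 1 and the normalization just shown can be tested with exact rational arithmetic. This is only a sketch under stated assumptions: it takes $P(z) = \prod_{i \ge 0}(1 + p q^i z)$ with $p = 1 - q$ for strictly ascending runs, reads $[z^n]A_m(z)$ as $[z^n]\sum_k (-1)^k \binom{n+1}{k} P^{m-k}$, and uses hypothetical helper names (`p_coeffs`, `poly_mul`, `run_count_probability`).

```python
from fractions import Fraction
from math import comb

def p_coeffs(q, n_max):
    """[z^n] P(z) for P(z) = prod_{i>=0} (1 + p q^i z), p = 1 - q (assumed form of P)."""
    p = 1 - q
    coeffs = [Fraction(1)]
    for n in range(1, n_max + 1):
        # elementary symmetric function: e_n = e_{n-1} * p * q^(n-1) / (1 - q^n)
        coeffs.append(coeffs[-1] * p * q ** (n - 1) / (1 - q ** n))
    return coeffs

def poly_mul(a, b, n_max):
    """Product of two coefficient lists, truncated after z^n_max."""
    out = [Fraction(0)] * (n_max + 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            if i + j <= n_max:
                out[i + j] += x * y
    return out

def run_count_probability(n, m, q):
    """[z^n] sum_{k=0..m} (-1)^k C(n+1, k) P^(m-k)."""
    base = p_coeffs(q, n)
    powers = [[Fraction(1)] + [Fraction(0)] * n]  # P^0
    for _ in range(m):
        powers.append(poly_mul(powers[-1], base, n))
    return sum((-1) ** k * comb(n + 1, k) * powers[m - k][n] for k in range(m + 1))

q = Fraction(1, 3)
table = {(n, m): run_count_probability(n, m, q) for n in range(1, 6) for m in range(8)}
for n in range(1, 6):
    print(n, sum(table[(n, m)] for m in range(8)))
```

For $n = 2$ the entries are $q/(1+q)$ for one run and $1/(1+q)$ for two runs, which matches the direct computation of the probability that two letters are strictly increasing.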

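Let us also rederive the formula for $A_m(1)$ in the limiting case $q \to 1$; since

\[ [z^n] A_m(z) = \sum_{k=0}^{m} (-1)^k \binom{n+1}{k} \frac{(m-k)^n}{n!}, \]

we have, replacing $k$ by $m-k$,

\[ A_m(1) = \sum_{k=0}^{m} (-1)^{m-k} \sum_{n \ge 0} \binom{n+1}{m-k} \frac{k^n}{n!}
 = \sum_{k=0}^{m} \frac{(-1)^{m-k}}{(m-k)!} \sum_{n+1 \ge m-k} \frac{(n+1)\, k^n}{(n+1-m+k)!}
 = m \sum_{k=0}^{m} \frac{(-1)^{m-k}\, k^{m-k-1}}{(m-k)!}\, e^{k}. \]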
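
In the permutation case the coefficient $n!\,[z^n]A_m(z)$ should count the permutations of $n$ elements with exactly $m$ ascending runs. The brute-force sketch below compares the alternating sum $\sum_k (-1)^k \binom{n+1}{k}(m-k)^n$ (the reading of the formula assumed here) with a direct enumeration; the function names are hypothetical.

```python
from itertools import permutations
from math import comb, factorial

def ascending_runs(word):
    """Number of maximal strictly ascending runs: one more than the number of descents."""
    return 1 + sum(1 for a, b in zip(word, word[1:]) if a > b)

def formula_count(n, m):
    """sum_{k=0..m} (-1)^k C(n+1, k) (m-k)^n."""
    return sum((-1) ** k * comb(n + 1, k) * (m - k) ** n for k in range(m + 1))

def brute_count(n, m):
    """Count permutations of n elements with exactly m ascending runs by enumeration."""
    return sum(1 for perm in permutations(range(n)) if ascending_runs(perm) == m)

for n in range(1, 6):
    print(n, [formula_count(n, m) for m in range(1, n + 1)])
```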
5 A Double Generating Function 

From Proposition 1 we infer that

\[ A_m(z) = z \sum_{k=0}^{m} \frac{(-1)^k}{k!} D^k P^{m-k}
 - \sum_{k=0}^{m-1} \frac{(-1)^k}{k!} D^k P^{m-1-k}. \]



Now these forms look like convolutions, and thus we introduce the double 
generating function 



\[ R(w,z) = \sum_{m \ge 0} A_m(z)\, w^m. \]



Upon summing we find

\[ R(w,z) = z e^{-wD} \frac{1}{1 - wP} - w e^{-wD} \frac{1}{1 - wP}
 = \frac{z - w}{1 - w \bigl(-p(z-w)\bigr)_\infty}; \]

the last step was by noticing that $e^{aD}$ is the shift operator $E^a$ (see e.g. [1]).

It is tempting to plug in $w = 1$, because of the summation of the $A_m(z)$'s,
but this is forbidden, because of divergence.

The instance ($q \to 1$)

\[ R(w,1) = \frac{1 - w}{1 - w e^{1-w}} \]

differs from Knuth's

\[ \frac{w(1-w)}{e^{w-1} - w} - w \]

just by 1, because in [6] $L_0 = 0$, whereas here it is $L_0 = 1$.
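
The claim that the two generating functions differ by the constant 1 is easy to test numerically. The sketch below assumes the readings $R(w,1) = (1-w)/(1 - w e^{1-w})$ and $w(1-w)/(e^{w-1} - w) - w$ for Knuth's function; the sample points are arbitrary values at which neither denominator vanishes.

```python
import math

def r_limit(w: float) -> float:
    """R(w, 1) in the limit q -> 1, assumed to be (1 - w) / (1 - w e^(1 - w))."""
    return (1 - w) / (1 - w * math.exp(1 - w))

def knuth_gf(w: float) -> float:
    """Knuth's generating function, assumed to be w (1 - w) / (e^(w - 1) - w) - w."""
    return w * (1 - w) / (math.exp(w - 1) - w) - w

samples = [-0.5, 0.1, 0.3, 0.6]
differences = [r_limit(w) - knuth_gf(w) for w in samples]
print(differences)
```

Algebraically, multiplying numerator and denominator of $R(w,1)$ by $e^{w-1}$ gives $(1-w)e^{w-1}/(e^{w-1}-w)$, and adding 1 to Knuth's expression yields the same quotient.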

Theorem 2. The generating function of the numbers $L_m$ is given by

\[ \frac{1 - w}{1 - w \prod_{i \ge 0} \bigl(1 - (w-1)\, p q^{i}\bigr)}. \]
The dominant singularity is at $w = 1$, and the local expansion starts as

\[ \frac{1+q}{1-w} + \cdots. \]

From this, singularity analysis [5] entails that

\[ L_m = 1 + q + O(\rho^{-m}) \]

for some $\rho > 1$ that depends on $q$.

Experiments indicate the expansion ($m \to \infty$)

\[ L_m = 1 + q - q^{m+1} + 2m\, q^{m+2} - (1 + 2m^2)\, q^{m+3} + \cdots, \]

so that it seems that one could take $\rho = \frac{1}{q} - \epsilon$. However, that would require a
proof.
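
Such experiments can be reproduced by expanding the generating function of Theorem 2 as a power series in $w$. The sketch below is one possible way to do this in floating point, not necessarily the way the experiments were carried out: it assumes the reading $(1-w)/(1 - w\prod_{i\ge0}(1-(w-1)pq^i))$, expands the product as $\sum_k a_k (1-w)^k$ with $a_k$ the elementary symmetric functions of $p, pq, pq^2, \dots$, truncates at a hypothetical cutoff `k_max`, and divides the two series.

```python
from math import comb

def strict_run_lengths(q: float, m_max: int, k_max: int = 40) -> list:
    """Approximate L_0..L_m_max (m_max >= 1) for strictly ascending runs."""
    p = 1.0 - q
    # a[k] = e_k(p, pq, pq^2, ...), so prod_i (1 + p q^i u) = sum_k a[k] u^k
    a = [1.0]
    for k in range(1, k_max + 1):
        a.append(a[-1] * p * q ** (k - 1) / (1.0 - q ** k))
    # coefficients in w of Pi(w) = sum_k a[k] (1 - w)^k, truncated at k_max
    pi = [sum((-1) ** j * comb(k, j) * a[k] for k in range(j, k_max + 1))
          for j in range(m_max + 1)]
    den = [1.0] + [-pi[j - 1] for j in range(1, m_max + 1)]  # 1 - w * Pi(w)
    num = [1.0, -1.0] + [0.0] * (m_max - 1)                  # 1 - w
    out = []
    for m in range(m_max + 1):  # power-series division num / den
        acc = num[m] - sum(den[j] * out[m - j] for j in range(1, m + 1))
        out.append(acc / den[0])
    return out

q = 0.5
lengths = strict_run_lengths(q, 20)
for m in (1, 2, 5, 10, 20):
    print(m, lengths[m], lengths[m] - (1 + q))
```

The printed differences $L_m - (1+q)$ shrink quickly with $m$, in line with the exponentially small error term.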



6 Weakly Ascending Runs 



Relaxing the conditions, we might now say that $\cdots > a_1 \le \cdots \le a_r > \cdots$ is a
run (of length $r$).

Many of the previous considerations carry over, so we only give a few remarks. 
The recursion (1) stays the same, but with $P = 1/(pz)_\infty$.
With this choice of $P$ the formula

\[ [z^n] A_m(z) = [z^n] \sum_{k=0}^{m} (-1)^k \binom{n+1}{k} P^{m-k} \]

still holds.

The bivariate generating function is

\[ \frac{z - w}{1 - w / \bigl(p(z-w)\bigr)_\infty}. \]

The poles of interest are the solutions of

\[ \prod_{i \ge 0} \bigl(1 + (w-1)\, p q^{i}\bigr) = w; \]



the dominant one is $w = 1$ with a local expansion

\[ \frac{1 + \frac{1}{q}}{1 - w} + \cdots, \]

from which we can conclude that $L_m \to 1 + \frac{1}{q}$.
And the experiments indicate that

\[ L_m = 1 + \frac{1}{q} - (-1)^m q^{2m-1} \bigl(1 - 2(m-1)\, q + (2m^2 - 5m + 4)\, q^2 + \cdots\bigr). \]
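
The weakly ascending case can be explored numerically in the same spirit. The sketch below assumes that at $z = 1$ the generating function reads $(1-w)/(1 - w/Q(w))$ with $Q(w) = \prod_{i\ge0}(1 - p q^i (1-w))$, rewrites it as $(1-w)Q(w)/(Q(w)-w)$, and expands in powers of $w$; the function name and the truncation parameter `k_max` are hypothetical.

```python
from math import comb

def weak_run_lengths(q: float, m_max: int, k_max: int = 40) -> list:
    """Approximate L_0..L_m_max (m_max >= 1) for weakly ascending runs."""
    p = 1.0 - q
    # a[k] = e_k(p, pq, pq^2, ...)
    a = [1.0]
    for k in range(1, k_max + 1):
        a.append(a[-1] * p * q ** (k - 1) / (1.0 - q ** k))
    # Q(w) = sum_k (-1)^k a[k] (1 - w)^k, expanded in powers of w
    qs = [sum((-1) ** (k + j) * comb(k, j) * a[k] for k in range(j, k_max + 1))
          for j in range(m_max + 1)]
    num = [qs[0]] + [qs[j] - qs[j - 1] for j in range(1, m_max + 1)]  # (1 - w) Q(w)
    den = list(qs)
    den[1] -= 1.0                                                     # Q(w) - w
    out = []
    for m in range(m_max + 1):  # power-series division num / den
        acc = num[m] - sum(den[j] * out[m - j] for j in range(1, m + 1))
        out.append(acc / den[0])
    return out

q = 0.5
lengths = weak_run_lengths(q, 12)
for m in (1, 2, 4, 8, 12):
    print(m, lengths[m], lengths[m] - (1 + 1 / q))
```

For $q = 1/2$ the values approach $1 + 1/q = 3$ with an error that alternates in sign, which is what a second pole on the negative real axis would produce.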